These are the reasons I prefer R:
- It would be harsh to rank R against other tools, since most open source tools are developed by teams of renowned, hard-working scientists and researchers. But judged by your work environment, your programming ability and the technology you are using, R comes out absolutely great.
- With R, nearly every possibility in applied data science is open to you; you only need to learn the R language, install the relevant packages and use them.
- R has both established and experimental algorithms. There are packages for very advanced algorithms, developed by highly experienced researchers and professors around the world. Being open source, they are updated and modified by others in similar fields, and before a modified or updated version is released for public use it is checked by a team of developers, scientists and researchers. For example, there are not only classical decision trees (in package “rpart”) but also an innovative approach called conditional trees (in package “party”). R also has packages for SVMs, neural nets, regression, survival analysis and many other machine learning methods.
- Many researchers use R for their first and subsequent research publications.
- The latest versions of many analytics and visualization tools (RapidMiner, Tableau, Microsoft Azure and Stata, to name a few) come with options for integrating R code.
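- R has direct access to most relational databases, either through ODBC (package RODBC) or through database-specific drivers (e.g. RMySQL), and can also reach non-relational stores such as MongoDB (package RMongo).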
- It can connect to social sites (Facebook, Twitter etc.) through their APIs and perform advanced text mining and sentiment analysis. It also supports JSON and XML parsing.
- R is also used to connect to Hadoop environments for processing large datasets. With the ‘foreach’ package and a registered parallel backend, it is possible to process larger (big) datasets in parallel.
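
To make the point about classical and conditional trees concrete, here is a minimal sketch that fits both kinds of model. The choice of the built-in `iris` dataset and the variable names (`fit`, `pred`, `rpart_accuracy`, `ct`) are my own assumptions for illustration, not part of the argument above. The conditional-tree part only runs if the optional “party” package happens to be installed.

```r
# rpart ships with standard R distributions as a recommended package.
library(rpart)

# Classical decision tree on the built-in iris data (illustrative choice).
fit <- rpart(Species ~ ., data = iris, method = "class")

# Predicted class for every training row, then in-sample accuracy.
pred <- predict(fit, newdata = iris, type = "class")
rpart_accuracy <- mean(pred == iris$Species)
print(rpart_accuracy)

# Conditional inference tree, attempted only when 'party' is available.
if (requireNamespace("party", quietly = TRUE)) {
  ct <- party::ctree(Species ~ ., data = iris)
  ctree_accuracy <- mean(predict(ct) == iris$Species)
  print(ctree_accuracy)
}
```

The accuracies here are measured on the training data, so they only show that the packages work, not how well either tree would generalize; a held-out test set would be needed for a fair comparison.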