[Translated Article] A Cheat Sheet: 11 Steps of Data Exploration with R Code
Translated by Lu Qin
Introduction
If you want to build an impeccable predictive model, trust me: no programming language and no machine learning algorithm can deliver it unless you first explore your data.
Just as a baby learns to walk before running, every data scientist should learn to explore data before getting accustomed to algorithms. Data exploration is of paramount importance in predictive modeling.
Data exploration not only uncovers hidden trends and insights, but also lets you take the first steps towards building a highly accurate model. Considering the popularity of R and its widespread use in data science, I've created a cheat sheet of the data exploration stages in R. It is highly recommended for beginners, who can perform data exploration faster using these handy code snippets. All you need to do is customize the code according to your needs.
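[Lu Qin's note] Data exploration is a very important stage in the scientific process of data analysis, and an important means of understanding what the raw data really looks like.
Original article: http://www.analyticsvidhya.com/blog/2015/10/cheatsheet-11-steps-data-exploration-with-codes/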
Lu Qin's WeChat: luqin360
China Data People QQ group: 290937046
Data People R language QQ group: 484784338
Data People Python QQ group: 434146007