Tag: python
-
From Structured Query Languages (SQL) to Dataframe Languages (DL)
Structured Query Languages (SQL) have existed since the 1970s and were first standardized in 1986 by the American National Standards Institute (ANSI). Their purpose was to provide a human-understandable language for querying data in tables in database management systems. This makes SQL a domain-specific language. Much later, they were also adopted to query…
-
Collaborative Data Science: About Storing, Reusing, Composing and Deploying Machine Learning Models
Why is this important? Machine learning has re-emerged in recent years as new Big Data platforms provide the means to train models on more data, make them more complex, and combine several models for an even more intelligent predictive/prescriptive analysis. This requires storing as well as exchanging machine learning models to enable…
-
Automated Machine Learning (AutoML) and Big Data Platforms
Although machine learning has existed for decades, the typical data scientist, as the role is called today, still has to go through a manual, labor-intensive process of extracting the data, cleaning, feature extraction, regularization, training, finding the right model, testing, selecting and deploying it. Furthermore, for most machine learning scenarios you do…
-
Big Data Lab in the Cloud with Hadoop+Spark+R+Python
This is an update of the second Big Data Lab for the cloud. As in previous versions, this document describes how you can create a Big Data Lab in the cloud on Amazon EMR. Besides some major upgrades to the newest Amazon Hadoop AMI (3.6.0), Spark (1.3.0) and R, it now also includes the possibility…