Tag: Big data
-
Finding CRAN packages right from the R console
Posted on June 25, 2019, Level intermediate Resource Length short
An article by Joachim Zuckarelli about working with R. Currently, there are more than 14,000 R package contributions on CRAN, providing R with an unparalleled wealth of features. The downside of the large and growing number of packages is that it becomes increasingly difficult to find the right tools to tackle a specific problem.
Tags programming big-data data-science
-
Image recognition in Python with TensorFlow and Keras
Posted on June 14, 2019, Level intermediate Resource Length medium
One of the most common uses of TensorFlow and Keras is the recognition/classification of images. If you want to learn how to use Keras to classify or recognize images, this article will teach you.
Tags python big-data data-science
-
How to create histogram in Rlang
Posted on May 22, 2019, Level intermediate Resource Length short
In this article, Data Sharkie shows you how to create a histogram in R using the ggplot2 package. When we get a new dataset for analysis or research, we often want to learn about the frequency distribution of the variable of interest.
Tags analytics miscellaneous big-data cio data-science
-
Building self-served ETL pipeline for third-party data ingestion
Posted on April 18, 2019, Level intermediate Resource Length medium
An article by Nikolaos Tsipas from Skyscanner, with the help of colleagues Omar Kooheji and Michael Okarimia, about how to import datasets from external sources and make them available for querying. Examples of imported data include analytics metrics, advertising data, and currency exchange rates, all of which are used by Skyscanner engineers and data scientists.
Tags big-data data-science software-architecture
-
Google's EdgeTPU benchmarked vs Intel's Movidius
Posted on March 24, 2019, Level beginner Resource Length short
An article by Frederik Bode about the first benchmark of Google's EdgeTPU Dev Board. The comparison is made against Intel's (first generation) Movidius Neural Compute Stick, and Google is the clear winner regarding inference time.
Tags big-data data-science analytics machine-learning
-
The data science behind Natural Language Processing
Posted on March 22, 2019, Level beginner Resource Length medium
John Thuma published this piece about the data science behind Natural Language Processing (NLP). Human communication is one of the most fascinating attributes of being sentient. We communicate in a variety of ways including speech and written symbols.
Tags miscellaneous big-data data-science learning
-
Managing analysis workflows in geospatial data science with GNU Make
Posted on March 3, 2019, Level intermediate Resource Length long
Martí Bosch wrote this guide on using Jupyter Notebooks with an iterative approach to both data analysis and software development. He also explains how to avoid some bad practices. Many issues can be settled by choosing helpful file names, good organization, documentation and source control of the code.
Tags big-data machine-learning data-science miscellaneous python
-
Understanding stabilising experience replay for deep multi-agent reinforcement learning
Posted on March 1, 2019, Level advanced Resource Length long
An article by Parnian Barekatain in which she describes some basic concepts in Reinforcement Learning. She also provides a link to Udacity's free course on Deep Learning with PyTorch.
Tags big-data machine-learning data-science miscellaneous
-
Handling imbalanced datasets in machine learning
Posted on February 6, 2019, Level intermediate Resource Length long
An article by Baptiste Rocca about handling imbalanced datasets in machine learning. He looks at what should and should not be done when facing an imbalanced classes problem.
Tags big-data data-science miscellaneous machine-learning
-
Understand TensorFlow by mimicking its API from scratch
Posted on January 7, 2019, Level beginner Resource Length long
An article by Dominic Elm about learning TensorFlow. TensorFlow is a powerful open source library for implementing and deploying large-scale machine learning models.
Tags programming big-data learning machine-learning