Tag: Data science

Trying out Containerized Applications on Apache Hadoop YARN 3.1

Posted on May 20, 2018, Level intermediate Resource Length medium

Shane Kumpf, Vinod Kumar Vavilapalli, and Saumitra Buragohain from Hortonworks wrote a series of articles about Hadoop. This is the fifth post in the series; in it, they explore running Docker containers on YARN for faster time to market and faster time to insights for data-intensive workloads at scale.

Tags big-data data-science database

Intuitive guide to data structures and algorithms

Posted on May 17, 2018, Level beginner Resource Length long

An excellent, simple, and user-friendly guide to data structures and algorithms from interviewcake.com. Interview Cake is a study tool that prepares software engineering candidates for programming interviews. It was created by Parker Phinney, an ex-Googler who has also worked at a handful of startups.

Tags programming data-science

How to build a mini supercomputer for under $100

Posted on April 18, 2018, Level beginner Resource Length medium

An article by Daniel Oberhaus offering a quick look at how GitHub user Wei Lin built a scalable computing cluster out of $7 chips. Wei Lin demonstrates that it is possible to build a homemade computing cluster that doesn't break the bank.

Tags programming cloud data-science agile

Everything you need to know about tree data structures

Posted on April 11, 2018, Level intermediate Resource Length long

An article by TK on tree data structures. If you are pursuing a computer science degree, you have to take a class on data structures, where you will learn about hash tables, linked lists, queues, and stacks. Those data structures are called "linear" data structures because they all have a logical start and a logical end. Trees and graphs, however, do not store data linearly; each organizes data in its own specific way.

Tags programming search data-science

Supercharging visualization with Apache Arrow

Posted on January 7, 2018, Level beginner Resource Length medium

An article on KDnuggets™ about how Apache Arrow provides a new way to exchange and visualize data at unprecedented speed and scale, even though interactive visualization of large data sets on the web has traditionally been impractical.

Tags big-data analytics data-science

A primer on deep learning

Posted on December 29, 2017, Level beginner Resource Length medium

A post by Jeremy Fain, the CEO and co-founder of Cognitiv, billed as the first neural network technology. In it, he explains what deep learning, machine learning, and artificial intelligence are.

Tags big-data data-science

AI turns design sketches into source code

Posted on October 27, 2017, Level beginner Resource Length long

Dimitar Mihov, writing for [tnw](https://thenextweb.com), covers an artificial intelligence (AI) system built by Airbnb that turns design sketches into product source code. The company is currently developing the system so that its designers and product engineers can take ideas from the drawing board and turn them into actual products almost instantaneously.

Tags big-data programming data-science

Apache Spark natural language processing library

Posted on October 22, 2017, Level beginner Resource Length long

An excellent community blog post from the engineering team at John Snow Labs, explaining their contribution of an open-source natural language processing (NLP) library for Apache Spark. Apache Spark is a general-purpose cluster computing framework with native support for distributed SQL, streaming, graph processing, and machine learning.

Tags big-data data-science

Tuning Your DBMS Automatically with Machine Learning

Posted on June 24, 2017, Level intermediate Resource Length medium

A guest blog post on AWS by Dana Van Aken, Geoff Gordon, and Andy Pavlo of Carnegie Mellon University, demonstrating how academic researchers can use the AWS Cloud Credits for Research Program to support their scientific breakthroughs.

Tags machine-learning data-science database

Data Exploration with Python, Part 1

Posted on January 26, 2017, Level intermediate Resource Length long

Tony Ojeda noticed a lack of structure in conventional approaches to exploratory data analysis, so he decided to document his own process in an attempt to come up with a framework for data exploration.

Tags big-data data-science

Building a Data Science Portfolio: Machine Learning Project Part 1

Posted on January 23, 2017, Level beginner Resource Length long

Vik Paruchuri, Dataquest's founder, has put together a fantastic resource on building a data science portfolio.

Tags database machine-learning data-science
