Tag: Data science
-
Trying out Containerized Applications on Apache Hadoop YARN 3.1
Posted on May 20, 2018, Level intermediate Resource Length medium
Shane Kumpf, Vinod Kumar Vavilapalli, and Saumitra Buragohain from Hortonworks wrote a series of articles about Hadoop. This is the fifth post in the series; in it, they explore running Docker containers on YARN for faster time to market and faster time to insights for data-intensive workloads at scale.
Tags big-data data-science database
-
Intuitive guide to data structures and algorithms
Posted on May 17, 2018, Level beginner Resource Length long
An excellent, simple, and user-friendly guide to data structures and algorithms by interviewcake.com. Interview Cake is a study tool that prepares software engineering candidates for programming interviews. It was created by Parker Phinney, an ex-Googler who also worked at a handful of startups.
Tags programming data-science
-
How to build a mini supercomputer for under $100
Posted on April 18, 2018, Level beginner Resource Length medium
An article by Daniel Oberhaus offering a quick look at how Wei Lin built a scalable computing cluster composed of $7 chips. As GitHub user Wei Lin has demonstrated, it's possible to make a homemade computing cluster that doesn't break the bank.
Tags programming cloud data-science agile
-
Everything you need to know about tree data structures
Posted on April 11, 2018, Level intermediate Resource Length long
An article by TK focusing on tree data structures. If you are pursuing a computer science degree, you will have to take a class on data structures, where you will learn about hash tables, linked lists, queues, and stacks. Those data structures are called "linear" data structures because they all have a logical start and a logical end. Trees and graphs, however, don't store data linearly; both store data in a specific way.
Tags programming search data-science
-
Supercharging visualization with Apache Arrow
Posted on January 7, 2018, Level beginner Resource Length medium
An article on KDnuggets™ about how Apache Arrow provides a new way to exchange and visualize data at unprecedented speed and scale, even though interactive visualization of large data sets on the web has traditionally been impractical.
Tags big-data analytics data-science
-
A primer on deep learning
Posted on December 29, 2017, Level beginner Resource Length medium
A post written by Jeremy Fain, the CEO and co-founder of Cognitiv, billed as the first neural network technology. In it, he addresses what deep learning, machine learning, and artificial intelligence are.
Tags big-data data-science
-
AI turns design sketches into source code
Posted on October 27, 2017, Level beginner Resource Length long
Dimitar Mihov, via [tnw](https://thenextweb.com), published an article about an artificial intelligence (AI) system built by Airbnb that turns design sketches into product source code. The company is currently developing a new AI system that will empower its designers and product engineers to take ideas from the drawing board and turn them into actual products almost instantaneously.
Tags big-data programming data-science
-
Apache Spark natural language processing library
Posted on October 22, 2017, Level beginner Resource Length long
An excellent community blog post from the engineering team at John Snow Labs, explaining their contribution to an open-source Apache Spark Natural Language Processing (NLP) library. Apache Spark is a general-purpose cluster computing framework with native support for distributed SQL, streaming, graph processing, and machine learning.
Tags big-data data-science
-
Tuning Your DBMS Automatically with Machine Learning
Posted on June 24, 2017, Level intermediate Resource Length medium
A guest blog post on AWS by Dana Van Aken, Geoff Gordon, and Andy Pavlo from Carnegie Mellon University, demonstrating how academic researchers can leverage the AWS Cloud Credits for Research Program to support their scientific breakthroughs.
Tags machine-learning data-science database
-
Data Exploration with Python, Part 1
Posted on January 26, 2017, Level intermediate Resource Length long
Tony Ojeda saw a lack of structure in conventional approaches to exploratory data analysis, so he decided to document his own process in an attempt to come up with a framework for data exploration.
Tags big-data data-science
-
Building a Data Science Portfolio: Machine Learning Project Part 1
Posted on January 23, 2017, Level beginner Resource Length long
Vik Paruchuri, Dataquest's founder, has put together a fantastic resource on building a data science portfolio.
Tags database machine-learning data-science