Tag: Data science
-
How to perform K-means clustering with Python in Scikit?
Posted on June 7, 2020, Level intermediate Resource Length medium
While deep learning algorithms belong to today's fashionable class of machine learning algorithms, there is more out there. Clustering is one type of machine learning where you do not feed the model a training set, but rather try to derive characteristics from the dataset at run-time in order to structure the dataset in a different way. It's part of the class of unsupervised machine learning algorithms. By Christian Versloot.
Tags python data-science analytics big-data
-
5 Useful jq commands to parse JSON on the CLI
Posted on June 5, 2020, Level beginner Resource Length long
JSON has become the de facto standard data representation for the web. It's lightweight, human-readable (in theory) and supported by all major languages and platforms. However, working on the CLI with JSON is still hard using traditional CLI tooling. By Fabian Keller.
Tags json big-data data-science programming software
-
Build your first data warehouse with Airflow on GCP
Posted on June 2, 2020, Level intermediate Resource Length medium
What are the steps in building a data warehouse? What cloud technology should you use? How to use Airflow to orchestrate your pipeline? By Tuan Nguyen.
Tags google cloud gcp big-data cio data-science
-
The nature of machine learning projects
Posted on May 29, 2020, Level beginner Resource Length short
An article by Michael Ohlsson about building a data-driven product. Building a data-driven product differs in many ways from how one would create a more conventional software product. A machine learning system is still a software system, but the process to develop the system is different.
Tags machine-learning big-data data-science
-
Change data capture with Debezium: A simple how-to
Posted on May 19, 2020, Level intermediate Resource Length long
Eric Deandrea wrote this piece about one question that always comes up as organizations move towards being cloud-native, twelve-factor, and stateless: How do you get an organization's data to these new applications?
Tags software-architecture streaming apache data-science queues
-
Convolutional neural network implementation for car classification
Posted on May 18, 2020, Level advanced Resource Length long
Convolutional Neural Networks (CNNs) are state-of-the-art neural network architectures that are primarily used for computer vision tasks. CNNs can be applied to a number of different tasks, such as image recognition, object localization, and change detection. By Dr. Evan Eames and Henning Kropp.
Tags big-data data-science azure learning
-
The 3 essentials for properly setting up Google Analytics conversion tracking
Posted on May 15, 2020, Level intermediate Resource Length long
We asked 48 experts to share their most useful tips, tricks, and tools for properly setting up and tracking website conversions via Google Analytics. Written by Belynda Cianci.
Tags analytics big-data cio data-science
-
Agile and Intelligent Locomotion via Deep Reinforcement Learning
Posted on May 8, 2020, Level advanced Resource Length long
Recent advancements in deep reinforcement learning (deep RL) have enabled legged robots to learn many agile skills through automated environment interactions. In the past few years, researchers have greatly improved sample efficiency by using off-policy data, imitating animal behaviors, or performing meta learning. Posted by Yuxiang Yang and Deepali Jain, AI Residents, Robotics at Google.
Tags machine-learning app-development big-data data-science
-
How open-source medicine could prepare us for the next pandemic
Posted on May 3, 2020, Level beginner Resource Length long
The old drug discovery system was built to benefit shareholders, not patients. But a new, Linux-like platform could transform the way medicine is developed—and energize the race against COVID-19. By Ruth Reader, writer for Fast Company.
Tags miscellaneous cio agile cloud data-science
-
How to use Roam Research: a tool for metacognition
Posted on April 24, 2020, Level beginner Resource Length long
A few weeks ago I discovered Roam which brands itself as "a note-taking tool for networked thought." Let's have a look at how to use Roam Research to achieve your personal growth goals. Written by Anne-Laure Le Cunff.
Tags data-science big-data management machine-learning software
-
Google engineers 'mutate' AI to make it evolve systems faster than we can code them
Posted on April 23, 2020, Level beginner Resource Length short
Much of the work undertaken by artificial intelligence involves a training process known as machine learning, where AI gets better at a task such as recognising a cat or mapping a route the more it does it. Now that same technique is being used to create new AI systems, without any human intervention. By David Nield.
Tags data-science big-data google machine-learning
-
Track, Store and Analyze granular Page Performance data: a practical guide
Posted on April 18, 2020, Level beginner Resource Length medium
In this post, the author guides you through all the steps needed to collect, process, and analyse the Navigation Timing results of all your website visitors' page views. Written by Jules Stuifbergen.
Tags analytics web-development big-data miscellaneous cio data-science