Tag: Data science
-
Introducing real-time data integration for BigQuery with Cloud Data Fusion
Posted on February 4, 2021, Level beginner Resource Length short
Businesses today have a growing demand for real-time data integration, analysis, and action. More often than not, the valuable data driving these actions—transactional and operational data—is stored either on-prem or in public clouds in traditional relational databases that aren't suitable for continuous analytics. By Bhooshan Mogal.
Tags cloud analytics cio google gcp big-data data-science
-
Fitness data needs an AI revolution
Posted on January 29, 2021, Level beginner Resource Length medium
As smart watches and other wearables provide users with sensors to monitor their fitness and health, they are generating a treasure trove of data. But whether all of this information actually contributes to a healthier society is up for debate. By Nicole Ferraro.
Tags big-data cio data-science miscellaneous
-
Ten computer codes that transformed science
Posted on January 24, 2021, Level beginner Resource Length long
From Fortran to arXiv.org, these advances in programming and platforms sent biology, climate science and physics into warp speed. By Jeffrey M. Perkel.
Tags programming data-science learning cio management
-
How I built Machine learning with Amazon Personalize and a Customer Data Platform
Posted on January 23, 2021, Level intermediate Resource Length long
By making off-the-rack machine learning models accessible for anyone to use, cloud ML services like Amazon Personalize help make ML-driven customer experiences available to teams at any scale. By @mparticle.
Tags aws machine-learning learning big-data data-science
-
Improving the performance of your imbalanced machine learning classifiers
Posted on January 17, 2021, Level beginner Resource Length medium
A comprehensive guide to handling imbalanced datasets. By Francis Adrian Viernes.
Tags machine-learning learning big-data data-science
-
NULL values in SQL queries
Posted on December 24, 2020, Level beginner Resource Length medium
This post is about NULL values in SQL, and comes courtesy of my friend and database wizard, Kaley. You should check out his website if you'd like to learn more about SQL, Oracle database, and making queries run faster. By Mitchum.
Tags data-science mysql database programming
-
Six principles for building robust yet flexible shared data applications
Posted on December 23, 2020, Level beginner Resource Length medium
Paul Done brought together a set of techniques he has identified to effectively deliver resilient yet evolvable data-driven applications.
Tags data-science big-data management cio
-
Getting started with distributed TensorFlow on GCP
Posted on December 22, 2020, Level beginner Resource Length medium
For many in the world of data science, distributed training can seem a daunting task. In addition to building and thoughtfully evaluating a high-quality ML model, you have to be aware of how to optimize your model for specific hardware and manage infrastructure. By Nikita Namjoshi.
Tags big-data data-science software gcp google
-
Modelling the time-of-arrival using distributions
Posted on December 21, 2020, Level beginner Resource Length medium
Estimating the time-of-arrival is a common problem in a wide range of settings, e.g. in logistics. This post will show a distribution-based approach that enables us to get more insights about arrival times and how we could use this information for decision-making in the logistics industry. By Jonas Laake.
Tags big-data data-science software
-
How to grid search deep learning models for time series forecasting
Posted on November 29, 2020, Level intermediate Resource Length medium
Grid searching is generally not an operation that we can perform with deep learning methods. This is because deep learning methods often require large amounts of data and large models, together resulting in models that take hours, days, or weeks to train. By Jason Brownlee.
Tags how-to machine-learning big-data data-science
-
Modern Distributed Data Architecture with Event Streams, Stream Processing and Derived Data
Posted on November 12, 2020, Level beginner Resource Length medium
Some of the most interesting projects I worked on at LinkedIn involved building large scale real-time pricing and machine learning products. They required crafting fault-tolerant distributed data architectures to support model training, forecasting and dynamic control systems. By Luthfur Chowdhury.
Tags cloud streaming software-architecture big-data cio data-science
-
Getting started with Python library Numpy
Posted on November 6, 2020, Level beginner Resource Length medium
NumPy is an open-source Python library for working with multidimensional arrays and matrices, with a large collection of mathematical functions that operate on them. By Shahid Siddique.
Tags json big-data data-science python