
HI I’M DANIEL MARCOUS

MY STORY

Hi! I’m Daniel, a Data Wizard - doing magic with data 🧙

I love bleeding edge tech & science, especially if it has to do with data.

I'm passionate about using innovative tech to solve existing real-world problems.

I'm currently Co-Founder | CTO @April, using tech to solve tax and save people time & money.

Previously, I spent 7 years @Google, where I did everything from IC to TL, manager, AI strategy for Google Cloud, and data science lead & CTO @Waze.

Most recently:

  • Founding and acting member of Waze's CTO office

  • Founding and leading Waze's data guild (data engineering, science, product analytics and business analytics)

  • Rendering Waze intelligent using ML

Before that, I spent a few years in the Israel Defense Forces, founding and leading the first big data team (now an entire division), doing cyber intelligence using ML and founding a center of excellence for tech innovation.

As a personal goal, I'm highly committed to advancing the Israeli data science and big data communities.

Other than that I spend time perfecting my mixology skills.


FEATURED

What I'm working on and excited about at the moment


WORK EXPERIENCE

What I’ve Done


August 2021 - Present

CO-FOUNDER | CTO
@APRIL

Solving Tax with Tech.
Backed by Team8 Fintech.

February 2020 - September 2021

CTO & DATA SCIENCE LEAD
@GOOGLE, WAZE

Lead data science and big data engineering as area tech lead for Waze (ATL, partially hands on).
Co-founder & member of Waze's office of the CTO.
Lead tech vision and empower technological excellence across the company.
Review and approve system design.
Lead AI & data strategy in partnership with Google Cloud.

September 2018 - September 2021

DATA SCIENCE MANAGER @GOOGLE, WAZE

Data Science Manager & Data Wizard - Practitioner of Data science, Analytics & (Big) Data Engineering.
Leading a team of data scientists in the field of Advertising & Monetization (team management & hands-on tech leading).

November 2014 - February 2020

TECH LEAD & DATA WIZARD
@GOOGLE, WAZE

Leading technical aspects of company data science, big data engineering & analytics.

January 2013 - October 2014

BIG DATA LEADER & CTO | FOUNDER & LEADER OF CENTER OF EXCELLENCE
@IDF, MAMRAM

Founded and led the first team of big data engineers in the IDF.
Data CTO - assessing and studying new technologies in the field.

Providing big data and analytics solutions for top IDF projects.
Cross IDF expert on: big data engineering (Hadoop), NoSQL, big data visualization etc.

May 2009 - March 2013

DBA, DATA ENGINEER & TEAM LEAD
@IDF, MAMRAM

Senior member and later leader (manager) of a data engineering team specializing in ETL technologies and database administration.


ACADEMIC EXPERIENCE

What I've [Formally] Learned


2016 - 2020

MSC, BEN-GURION UNIVERSITY OF THE NEGEV

Master of Science in Information Systems Engineering (Data Science).
Master's dissertation in machine learning - Clustering of Big Geospatial Data.

2011 - 2015

BA, THE OPEN UNIVERSITY OF ISRAEL

Double degree in management and computer science.
First year - president's honors.
Second and third years and overall - dean's honors.

Graduated cum laude.

September 2008 - April 2009

SOFTWARE ENGINEERING, SCHOOL FOR COMPUTER PROFESSIONS, IDF

Intensive & prestigious IDF course for software engineering.

Various

ONLINE

Stanford University - Statistical Learning
Coursera - Bayesian Statistics
Coursera - Machine Learning
Johns Hopkins (Coursera) - Data Science Specialization


PUBLISHED WORK


PROJECTS

I Contribute

Matching the best rider-driver couples for Waze Carpool using machine learning

Waze's motorcycle navigation mode - creating a new route recommendation mechanism and ETA prediction models specialised for motorcycle drives using machine learning.

Kaggle Israel community group.

Running regular monthly meetups at https://www.meetup.com/DataHack/events/

Kaggle IL Offering :

  • Come and work together (or alone) on a live, on-going Kaggle competition.

  • Receive (if you want it) mentoring and tips & tricks from Kaggle masters, Kaggle Days (Paris) competition winners and top Israeli competitive ML dogs and experts.

  • Be a part of the first Israeli community in the field of competitive machine learning.

  • Share knowledge with other experts around competitive ML and learn bleeding edge ML techniques by studying Kaggle Kernels and hearing talks from Kaggle Masters.

The Association for The Advancement of Data Science In Israel.
Datahack is an Israeli non-profit dedicated to the advancement of data science and machine learning in Israel. We focus on strengthening academia-industry ties, data science literacy and education, intra-community cohesion and knowledge sharing and empowerment of underrepresented populations.


CODE

I create

Provides a set of tools to make working with geospatial objects quick and painless. These tools were designed with S2 objects (Google's "geometry on a sphere" abstractions) in mind as the leading data structure to be used when working with geospatial data.

A novel distributed implementation of the k Betweenness Centrality (kBC) algorithm for Spark, using GraphX.

Fun facts :

  1. Used in production at several companies

  2. Taught at several university courses

  3. 39 GitHub stars

How it works : shorturl.at/efBCP

Provides deepboost model training, evaluation, prediction and hyperparameter tuning using grid search and cross validation.
Based on Google's Deep Boosting algorithm by Cortes et al.

Over 19k package downloads !

An implementation of the GloVe model for learning word representations from big text corpora, distributed with Apache Spark.

Based on the original implementation : https://github.com/stanfordnlp/GloVe

Distributed Density-Based Geospatial Clustering of Applications with Noise over large datasets, using Apache Spark.

First place solution for the Armis DataHack 2019 Challenge - Devices Gone Rogue.
Anomaly detection in network data using self-supervised learning.

Utilities for data science work, including templates for EDA, cleaning, modeling and pipelining.
Includes notebooks and importable Python utilities with advanced scikit-learn compatible transformers I commonly use.

Music recommender system based solely on song audio using Wavenet embeddings.

Cuts Mac workstation configuration time from days to minutes.


COURSES

I Teach

5 hours

During this course you will take your first steps as a data scientist. By the end of the training you will have trained your very own model on a real-world dataset, and be able to use it for predictions.

 

This covers both basic theoretical background and practical skills required to successfully tackle your first data science project!

The content in this workshop covers the preliminary concepts & skills necessary for data science work. Although considered “preliminary” (a must have), we will cover them in the depth necessary to fully understand them and use them later.

These include the data science workflow, ML task types (or - what can / should I do with ML), popular ML algorithms (models), gradient descent (what “training” a model means), avoiding overfitting (generalisation, complexity issues, validation etc.) and hyper parameter tuning.
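As a rough illustration of what “training” a model with gradient descent means, here is a minimal sketch (not the workshop's actual code; the function name `fit_line`, the learning rate and the toy data are all hypothetical). It fits a straight line by repeatedly nudging the parameters against the gradient of the mean squared error.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of MSE = (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = (2.0 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2.0 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        # Step against the gradient; lr (the learning rate) is a hyperparameter.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
```

On this noise-free toy data the learned `w` and `b` approach the slope and intercept used to generate it; a learning rate that is too large would make the updates diverge instead.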

We will go over simple Python code for all of the above, and later in the workshop use it to compete (together or alone) in a Kaggle competition to practice this for real.

Presentation : https://goo.gl/i8ttz9
Code : https://goo.gl/9jTDb1

5 hours

During this course you will gain a basic understanding of TensorFlow and practice basic coding with TensorFlow 2.0.

We will cover :

1. What is TensorFlow (Why TF 2.0 >> TF 1.0)

2. High Level APIs (Keras & Estimators)

3. TensorFlow Components (tf.data, Checkpoints, Accelerators)

4. TensorFlow Hub & Transfer Learning basics 

5. TensorBoard Overview

6. TFX useful components overview (tfdv, tfma)

We will have many code examples.

You'll get an understanding of what they do, how to change them and how to use them for your data.

Presentation : tiny.cc/tf2-101-preso

Code : tiny.cc/tf2-101-lab



2 hours

A practical introduction to preprocessing for machine learning, with applications and code.

We will use scikit-learn transformers in this course.

Presentation : https://goo.gl/q6a376
Code : https://goo.gl/XNmkBW

2 hours

A practical introduction to machine learning model tuning, with applications and code.

We will go over hyperparameter tuning, evaluation schemes, regularization and more.
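To give a flavour of how these topics fit together, here is a minimal sketch (not the course material; all function names and the toy data are hypothetical) that tunes a regularization strength with grid search, scoring each candidate by k-fold cross validation. The model is a one-parameter ridge regression, used here only because it has a closed-form fit.

```python
def k_fold_indices(n, k):
    """Yield (train, validation) index lists for k contiguous folds."""
    fold_size = n // k
    for i in range(k):
        start = i * fold_size
        stop = n if i == k - 1 else start + fold_size
        val = list(range(start, stop))
        train = [j for j in range(n) if j < start or j >= stop]
        yield train, val


def fit_ridge(xs, ys, alpha):
    """Closed-form 1D ridge regression without intercept: y ~ w * x."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)


def cv_score(xs, ys, alpha, k=3):
    """Mean validation MSE across k folds for a given alpha."""
    errors = []
    for train, val in k_fold_indices(len(xs), k):
        w = fit_ridge([xs[i] for i in train], [ys[i] for i in train], alpha)
        errors.append(sum((w * xs[i] - ys[i]) ** 2 for i in val) / len(val))
    return sum(errors) / len(errors)


def grid_search(xs, ys, alphas, k=3):
    """Return the alpha with the lowest cross-validated MSE, plus all scores."""
    scores = {alpha: cv_score(xs, ys, alpha, k) for alpha in alphas}
    best = min(scores, key=scores.get)
    return best, scores


# Toy, noise-free data from y = 3x (illustrative only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [3.0 * x for x in xs]
best_alpha, scores = grid_search(xs, ys, alphas=[0.0, 0.1, 1.0, 10.0])
```

Because the toy data has no noise, the smallest alpha wins here; on noisy real data a non-zero alpha often generalises better, which is the point of scoring on held-out folds.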

Presentation : https://goo.gl/6nnVpy
Code : https://goo.gl/UXWSWh

2 hours

Practical applications of advanced ML models and methods.

We will go over ensembling, boosting, transfer learning and autoML.

Presentation : https://goo.gl/XDHCiV
Code : https://goo.gl/u1zTf3


PRESENTATIONS

I Speak

Overview of current technologies for deepfake detection.

Tips & tricks in relation to this Kaggle competition

Latest / Buzziest / Most Innovative / Most Promising in software engineering of 2019.

Based on :

  • work of leading companies (Netflix, Google, Spotify…)

  • Trends in tech world

  • Opinions (of the people that set the tone - e.g. Martin Fowler)

The world of transportation is radically changing.

It is an industry with immense technological challenges, most of which are AI related.

At the current pace, and with major industry players active, it will become unrecognisable in the coming years.
In this talk I aim to cover the different fields that it includes, the data science problems that it poses, and current state of the art solutions.

The focus of this talk will be smart cities, which multiple teams @Google work on, including my own.

I will present my own work, as well as smart city research and solutions by my counterparts at Google and Uber.

youtube.com/watch?v=en5mzFEdwdI

HACKATHON WINS

Overview of Google's S2 library for two-dimensional projections on a three-dimensional sphere (similar to a globe).

How to create and maintain a production-ready BIG ML Workflow - From Zero to Hero.

Intro to Cloud AI Platform Notebooks, creating a Google Cloud Platform account and redeeming credit.

Big data real-time architectures -
How to do big data processing in real time?
What architectures are out there to support this paradigm?
Which one should we choose?
What advantages / pitfalls do they have?

An overview of the distributed database landscape - what it is, how it works, who uses it and for what purpose.

Analyzing and explaining the intersection between the fields of big data and data visualization including domain theory and practical examples.


MIXOLOGY


CONTACT ME

Daniel Marcous

+972-547-229760

  • LinkedIn
  • YouTube