I am a data scientist with extensive experience in designing and implementing software solutions to automate complex tasks. I gained my expertise through a doctorate in artificial intelligence and several consulting projects. If you have data that needs to be analyzed, or need advice on how to architect software systems to support your (data-centric) business, I would be glad to hear from you.
What can I do for you?
Daniël Reichman, Ph.D.
I am a data scientist with an interest in applying my knowledge to real-world problems. I currently live in Durham, North Carolina, where I am a Research Scientist at Duke University, applying artificial intelligence methods to landmine detection under a US Army grant at the Applied Machine Learning Lab. I have entrepreneurial aspirations in the start-up realm, through the university and beyond. Outside of work: I come from Belgium, and I enjoy weight lifting and mountain hiking.
I have been programming for over a decade and have extensive experience in designing and implementing applications in several programming languages and paradigms, which I detail in the "Projects" tab below. During my time in this field I have encountered a wide range of problems of varying degrees of open-endedness. In some instances the goal was to design a machine learning algorithm to solve a particular task (e.g., identify landmines in subsurface radar imagery, identify building rooftops in satellite imagery). In other instances I was approached to design and implement a software solution to assist a non-technical person in accomplishing their research goals (see Projects > Textual Analysis and Projects > Actuarial Work). In yet other instances I have architected and implemented complex coding frameworks that systematize the coding effort on projects with big-data needs, so that other project members can implement solutions faster and with fewer systematic errors (see Projects > Software Architecture).
For each project, I have regularly presented my work to technical audiences via PowerPoint presentations, academic papers, poster presentations, and in-person talks.
The task is to identify the presence of landmines in previously unseen locations automatically, quickly, and accurately. The data is collected using an electromagnetic modality that measures the composition of the subsurface. Because this data can be visualized as imagery, my research focuses on developing software that uses techniques from computer vision, deep learning, image processing, machine learning, and statistical learning theory. I worked on this problem at Duke University for my Ph.D. and continue to work on it as a Research Scientist.
Example of the automated detection system running on data from the subsurface. The subsurface data can be visualized as imagery in which buried explosive threats typically appear with a hyperbolic shape. The detection algorithm receives the data column by column and automatically determines whether the data it has collected corresponds to what it has learned to recognize as a buried threat. The red line shows the algorithm in action; once it is done processing, it provides a prediction of buried threat presence at that location. Dark regions correspond to a low probability and light regions to a high probability of buried threat presence.
Several of my projects have involved color imagery, for which I have both developed new methods and applied existing ones. My experience in image processing spans hand-crafted feature design, feature learning (e.g., dictionary learning), and convolutional neural networks and deep learning. These methods were applied to natural imagery and to the problem of rooftop detection in satellite imagery (color and height maps).
The top row shows examples of satellite images captured over different US cities. The bottom row shows the output of a rooftop detector using deep learning algorithms. The white spots indicate locations where the algorithm estimates rooftop presence.
Much of my work consists of writing software to run scientific experiments on the dataset at hand. To ensure that experiments are repeatable and to minimize errors in experiment implementation, I have designed large frameworks of code to support that goal. These frameworks have increased productivity throughout the lab by systematizing the workflow for other members so they may (1) share code more easily, (2) minimize the rewriting of code, and (3) reduce the number of bugs in the code. I have written frameworks for four different projects that are used by multiple people and have been deployed in different computing environments (cloud, different operating systems, varying hardware). They have had to accommodate varying degrees of data sensitivity and to support both big-data analysis and fine-grained, detailed analysis.
Illustration of a part of the class hierarchy in the framework that underpins our work on the landmine detection projects.
Writing software to analyze large archives of textual documents is challenging, partly because such documents are not necessarily structured in any particular format. I was approached by a researcher of charitable foundations who planned to write a book based on 60,000 documents from one foundation, spanning the years 1986-2008. This required developing software to automatically generate a taxonomy of the documents. The taxonomy was then used to produce word clouds and identify common topics across the documents and across the years. I also built a GUI for conveniently browsing the topics in the corpus.
The project was split into three parts. The first part is complete, with a book published on the topic; the publication for the second part is currently in preparation.
Screenshot from the Graphical User Interface (GUI) that was developed for the purpose of visualizing and analyzing the set of 60,000 documents.
During my summer internship at New York Life Insurance Company, I applied machine learning to systematically and drastically reduce the run-time of the models that check the company's solvency (i.e., that it will not go bankrupt). An insurance company has many liabilities (the policies it issues) that it must match with assets (stocks, bonds, etc.) to ensure its ability to pay off the policies. The model determines whether a given set of assets, under different interest rate scenarios over time, will balance out against the liabilities. I worked on a clustering algorithm that summarizes the assets while maintaining model accuracy. This reduced the run-time of the model from a few days to a few hours.
Flowchart of the Asset Liability Modeling (ALM) process used to compute the company's solvency under different economic scenarios (e.g., low interest rates).
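As a rough illustration of the idea of summarizing assets by clustering, here is a minimal sketch. It is not the method used at New York Life, which is not described on this page. It uses plain k-means; the `Asset` fields, the choice of features (duration and coupon), and the name `compress_assets` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Asset:
    """Hypothetical asset record; the fields are illustrative assumptions."""
    value: float     # market value
    duration: float  # interest-rate sensitivity, in years
    coupon: float    # annual coupon rate, in percent


def _dist2(p: Tuple[float, float], q: Tuple[float, float]) -> float:
    """Squared Euclidean distance between two (duration, coupon) points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def _summarize(members: List[Asset]) -> Asset:
    """Merge assets into one representative: summed value, value-weighted features."""
    total = sum(a.value for a in members)
    return Asset(
        value=total,
        duration=sum(a.value * a.duration for a in members) / total,
        coupon=sum(a.value * a.coupon for a in members) / total,
    )


def compress_assets(assets: List[Asset], k: int, iters: int = 20) -> List[Asset]:
    """Summarize `assets` with at most `k` representatives.

    Assumes k >= 1 and strictly positive asset values.
    """
    points = [(a.duration, a.coupon) for a in assets]
    # Deterministic initialization: pick centers spread across the input order.
    step = max(1, len(points) // k)
    centers = points[::step][:k]
    labels: List[int] = []
    for _ in range(iters):
        # Assignment step: each asset joins its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda c: _dist2(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Update step: move each non-empty cluster's center to its weighted mean.
        for c in range(len(centers)):
            members = [a for a, lab in zip(assets, labels) if lab == c]
            if members:
                rep = _summarize(members)
                centers[c] = (rep.duration, rep.coupon)
    # Build one representative per non-empty cluster.
    clusters = [
        [a for a, lab in zip(assets, labels) if lab == c]
        for c in range(len(centers))
    ]
    return [_summarize(m) for m in clusters if m]
```

Because each representative carries the summed value and value-weighted features of its cluster, the compressed portfolio keeps the same total value and the same total of value times duration as the original, which is one simple way to limit the accuracy lost by running a model on fewer assets.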
Every once in a while I chance upon something very interesting that I would like to share in some depth. The topics will probably have something to do with my research on landmine detection, machine learning, or programming, though not necessarily. I will aim to make every post self-contained and understandable. As always, feel free to get in touch with me with any questions or comments!
If you want a hint or would like to discuss a solution to one of the problems, feel free to contact me. New problems added!
No fancy math is required to solve any of these problems.
Express
$$\sqrt[\leftroot{9}\uproot{3}8]{2207 - \displaystyle\frac{1}{2207 -
\displaystyle\frac{1}{2207 - \displaystyle\frac{1}{2207 - ...}}}}$$
as \(\displaystyle\dfrac{a + b\sqrt{c}}{d}\) where \(a,b,c,d \in \mathbb{Z}\)
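For readers who want to check a candidate answer numerically, here is a small sketch. It approximates the continued fraction by iterating \(x \leftarrow 2207 - 1/x\), takes the eighth root, and compares the result with a proposed closed form. The function names and the iteration count are my own arbitrary choices, and the sketch does not give the answer away.

```python
import math


def nested_value(a: float = 2207.0, iterations: int = 50) -> float:
    """Approximate the eighth root of a - 1/(a - 1/(a - ...)) by fixed-point iteration."""
    x = a
    for _ in range(iterations):
        x = a - 1.0 / x
    return x ** (1.0 / 8.0)


def closed_form(a: float, b: float, c: float, d: float) -> float:
    """Evaluate (a + b*sqrt(c)) / d for a candidate answer."""
    return (a + b * math.sqrt(c)) / d


def matches(a: float, b: float, c: float, d: float, tol: float = 1e-9) -> bool:
    """True if the candidate closed form agrees with the numerical value."""
    return abs(closed_form(a, b, c, d) - nested_value()) < tol
```

Calling `matches(a, b, c, d)` with your own values tells you whether you are on the right track; a numerical match is evidence, not a proof.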
Problem 5
There are \(3\) jars, each with an incorrect label. Each jar contains \(100\) beans.
One jar has \(100\) white beans, another has \(100\) black beans, and the third has
\(50\) white and \(50\) black beans. What is the smallest number of beans you need to draw,
and from which jar, to know with absolute certainty which jar is which?
Problem 6 (New!)
\(20\) prisoners are given a shot at redemption.
They are told that on the next day they will be lined up so that each of them can see only the people in front of them. Each prisoner will be given a hat, either black or white (assume an infinite supply of hats), and each prisoner must guess the color of the hat he or she is wearing. The last person in line goes first,
and everyone can hear his or her response and its consequence (being set free or killed).
Is there a way for the prisoners to collaborate to save more than \(10\) people?
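If you would like to test a strategy before trusting it, the sketch below simulates the line-up. It assumes that every prisoner hears all earlier guesses and their outcomes, and that a wrong guess is fatal; the names `count_saved`, `pair_strategy`, and `worst_case` are mine. The included baseline only guarantees \(10\) survivors, so it does not spoil the puzzle: the challenge is to write a strategy that beats it.

```python
import random
from typing import Callable, List, Sequence

# A strategy sees the hats in front (front of the line first), plus every
# earlier guess and whether it was correct, and returns "black" or "white".
Strategy = Callable[[Sequence[str], Sequence[str], Sequence[bool]], str]


def count_saved(hats: Sequence[str], strategy: Strategy) -> int:
    """Run one round and return how many prisoners guessed correctly.

    hats[0] is the front of the line; the prisoner at the back (hats[-1])
    sees everyone else and guesses first.
    """
    guesses: List[str] = []
    outcomes: List[bool] = []
    for i in range(len(hats) - 1, -1, -1):
        visible = hats[:i]  # hats of everyone in front of prisoner i
        guess = strategy(visible, guesses, outcomes)
        guesses.append(guess)
        outcomes.append(guess == hats[i])
    return sum(outcomes)


def pair_strategy(visible: Sequence[str], guesses: Sequence[str],
                  outcomes: Sequence[bool]) -> str:
    """Baseline: prisoners work in pairs. The first of each pair announces the
    hat color directly in front; the second repeats it and is always saved."""
    if len(guesses) % 2 == 1:
        return guesses[-1]
    return visible[-1] if visible else "white"


def worst_case(strategy: Strategy, n: int = 20, trials: int = 1000,
               seed: int = 0) -> int:
    """Smallest number saved over random hat assignments (an estimate, not a proof)."""
    rng = random.Random(seed)
    return min(
        count_saved([rng.choice(("black", "white")) for _ in range(n)], strategy)
        for _ in range(trials)
    )
```

Random trials can only refute a claimed guarantee, never establish one, so treat `worst_case` as a sanity check rather than a proof.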
Problem 7
Three men decide to go fishing on a remote island.
It gets too late for them to go home, so they decide to sleep on the island and divide their catch the next day.
The first guy can't fall asleep, so he decides to count the fish. He sees there is one too many to divide evenly by 3, so he throws
one fish out, takes his share and goes home. The second guy wakes up a little later and wants to go home. He doesn't realize that one of them has left. He counts the fish and realizes that there is one too many to divide evenly by 3, so he throws one out, takes his share and goes home. In the morning the third guy wakes up and wants to go home.
He doesn't realize that both his buddies have left him, so he counts the fish and realizes that there is one too many to divide evenly by 3,
so he throws one fish out, takes his share and goes home.
How many fish did the three guys start off with in order for this to be possible?
Is your answer unique?
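To check a candidate answer, you can replay the story in a few lines of code. This is a minimal sketch with names of my own choosing; it reads "takes his share" as one third of the fish left after one is thrown out, since each man believes all three are still on the island.

```python
from typing import List, Optional


def replay(start: int, men: int = 3) -> Optional[int]:
    """Replay the night for `start` fish.

    Returns the number of fish left after the last man goes home, or None
    if at some point the count is not "one too many to divide evenly by 3".
    """
    fish = start
    for _ in range(men):
        if fish % 3 != 1:
            return None      # the story cannot happen with this count
        fish -= 1            # throw one fish out
        fish -= fish // 3    # take one third as his share
    return fish


def candidates(limit: int) -> List[int]:
    """All starting counts up to `limit` for which the story works."""
    return [n for n in range(1, limit + 1) if replay(n) is not None]
```

Running `candidates` with a generous limit also helps with the uniqueness question, though an argument for why the pattern continues still has to be made by hand.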