Elon Musk often remarks on how Tesla and SpaceX are a result of thinking about technologies that would have the greatest positive impact on humanity. For me, the big idea has always been using the Internet to fundamentally reinvent government. Humanity faces many great challenges as we move forward. I believe our ability to meet these challenges rests, in large part, on how effectively we can come together to make decisions for the common good. Our best invention for doing this to date is western democracy. Unfortunately it’s pretty crap. I believe better systems for governance are indeed possible - and more than that - we ought to be thinking about what steps we can take to start building towards them now.
I am taking my own steps in this direction by launching Thinklab - a startup with a mission to fix science. My assessment is that the path to a radically better system of governance lies through the creation of a radically better system of advancing science. Why? Because science is horribly broken, I think there’s a realistic pathway to fixing it, and I believe the creation of a next generation platform for determining scientific consensus will provide the building blocks for a new internet-native institution of government. An institution that will enable us to effectively act in the common good and tackle the many challenges we face.
My main aim with this post is to recruit a team of talented people to join Thinklab. Before we get too far - let me briefly introduce myself. I’ve been an entrepreneur my whole life. I’ve founded and sold two startups and once made 500k from a high frequency trading algorithm I developed - and wrote about. For more about me see my LinkedIn profile or simply keep reading.
So is science broken? Well, I would assert that since the advent of the internet we’ve had the potential of a drastically more effective system of advancing science. A system where scientists share their questions, ideas, and insights online in real-time while collaborating with a worldwide network of their peers. If you’d like to read more about the potential of such a system I’d highly recommend Michael Nielsen’s book: Reinventing Discovery: The New Era of Networked Science.
The internet has brought this incredible potential, but sadly the way we conduct science hasn’t changed much since scientific journals were introduced in the 17th century. For the most part science is still done offline, in silos, with incredible redundancies and delays. Data sets are not published. Software is not published. Negative results are not published. Detailed research methods are not published. Research is often not reproducible and no one is being held accountable. Researchers have a singular focus on publishing papers in peer reviewed journals and anything that distracts from that is simply not worth their time. And the truth is it’s hard to blame them - it’s what they have to do (or at least think they have to do) in order to have a career in science.
So why haven’t these problems been fixed? There’s certainly been a growing awareness of many of the problems I’ve described and the fact is there have been many tools built to encourage things such as getting scientists to openly discuss science on the web. However, they almost all fail. I believe this is because they fail to address the fundamentally broken incentive structure that exists at the core of science. Science has a reward structure that made sense pre-internet but now just works to prevent innovation. I think it can be hard for scientists to even imagine a different system is possible when their entire career has been dependent on their ability to ‘work’ the current system.
As a non-scientist I may have an advantage here. To me, it’s clear that if we’re going to truly fix science we need to fix the system itself. I’m creating Thinklab to do exactly that. It’s all about aligning the incentives of scientists as individuals with what’s best for society and science as a whole. I won’t get into the details of how that will be done for now - we haven’t launched publicly yet and I’d like to keep some of the secret sauce secret. However, I will say that I truly believe this is a problem we can fix. And it’s simply a problem we must fix.
Thinklab will be operated as a for profit company with a mission that goes far beyond profit. I’ve talked a lot about the vision and big picture - but to be clear, launching this venture will require executing a strategy in whatever small area of science will allow us to begin to get user adoption while putting us on a path towards executing on the vision.
Building a team
My goal is to recruit a team of talented people passionate about this mission. I think there is the opportunity here to make a huge positive impact on the world. Here’s how I would describe the kind of people I’d like to attract and the culture I hope to build:
I’m looking to attract people in a variety of areas but I’m particularly looking to attract a cofounder. This person would have strengths that complement my own and I would describe them as follows:
If any of this sounds like you please e-mail me: firstname.lastname@example.org. Let’s have coffee, drinks, or lunch! I’m in San Francisco and will most likely want to keep the company here in the Bay Area.
If you’d like to get to know me further keep reading. I’ll be sharing some of my beliefs on government, punishment of crime, inequality, and education. I think shared beliefs can be one of the most powerful things that can pull a team together - particularly when those beliefs are not commonly held.
The internet has transformed many industries and areas of life. Why not government? Let’s imagine we are the framers of the U.S. Constitution. But instead of it being the 18th-century it’s actually 2014. Might we not explore how the Internet could be used to create a more functional government? Well guess what folks - this is not a hypothetical - here we are. My view is that even small improvements in the function of government can have massive positive impacts on society. And my belief is that we can actually drastically improve government.
Let’s look at how laws and policies are decided upon now. We have people who get elected to Congress who make decisions without really understanding the issues, who are blinded by partisanship, who are funded by corporate interests, and who have to worry about what will sound good in soundbites through a media that exists mostly to entertain a general public who likely don’t understand the issues either. In short - let’s just agree it’s complete crap.
Let’s try to imagine a more ideal method. I would suggest that for each policy decision we take the top 10 experts in that area and get them to debate the issue in public and come to some kind of consensus. We’d need to make sure the group has high integrity, high empathy, no conflicts of interest, and that they are a well rounded group. They’d need to be held accountable for making decisions in the best interests of society. Let’s see how close we can come to achieving this.
I think we could start by designing a system where policy experts have online profiles and where each expert rates the people that they know by indicating how much respect they have for them in each relevant area of policy. With this data we can then algorithmically determine who the top policy experts are in any given area. This could be done by applying the same basic algorithm that Google uses to rank webpages. Instead of looking at the links between webpages we’ll be looking at the explicit ratings given from person to person in each policy area. Just as Google’s algorithm actually has a lot of layered complexity I’m sure this algorithm would need it as well. An example might be ensuring that a representative group is selected when there are distinct schools of thought.
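To make the idea a bit more concrete, here is a minimal sketch of a PageRank-style ranking over person-to-person ratings in a single policy area. Everything in it is an assumption made for illustration: the `rank_experts` name, the shape of the `ratings` dictionary, the damping factor of 0.85, and the made-up people. It leaves out all of the layered complexity mentioned above.

```python
def rank_experts(ratings, damping=0.85, iterations=50):
    """PageRank-style scores from weighted respect ratings.

    ratings: {rater: {rated_person: weight}} for one policy area
    (a hypothetical format chosen for this sketch).
    """
    people = set(ratings)
    for given in ratings.values():
        people.update(given)
    people = sorted(people)
    n = len(people)
    score = {p: 1.0 / n for p in people}  # everyone starts out equal

    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in people}
        for rater in people:
            given = ratings.get(rater, {})
            total = sum(given.values())
            if total == 0:
                # Someone who rates nobody spreads their influence evenly.
                for p in people:
                    new[p] += damping * score[rater] / n
            else:
                # Influence flows to rated people in proportion to the rating.
                for rated, weight in given.items():
                    new[rated] += damping * score[rater] * weight / total
        score = new
    return score


ratings = {
    "alice": {"bob": 5, "carol": 3},
    "bob": {"carol": 4},
    "carol": {"bob": 2},
    "dave": {"alice": 1, "bob": 5},
}
scores = rank_experts(ratings)
top = sorted(scores, key=scores.get, reverse=True)
print(top)
```

In this toy data nobody rates dave, so he ends up with the lowest score no matter how many ratings he hands out. Influence only comes from respect given by others.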
What’s great is that the system is fundamentally democratic. Each person is fundamentally equal within the system. Individuals only gain or lose influence as a result of the respect given to them by other people who are respected within the system. And I’d suggest respect would need to derive not just from expertise but also from empathy and integrity.
I realize this is not as simple as someone simply creating an algorithm but - as mentioned - I think we ought to be thinking about where we need to end up and working back to figure out what steps will get us there. If humanity is going to flourish as we move forward here on Earth we really need better ways of coming together to make decisions for the common good. Either that or we might all have to hope to join Elon Musk on his Mars colony!
On punishment of crime
I do not believe society should be in the business of punishing people for the sake of revenge or retribution. Our laws ought to be designed by thinking about the overall outcomes and impacts for society. From this perspective punishment makes perfect sense when it serves as a deterrent or physically prevents further crime. However, if we only concern ourselves with punishing people simply to satisfy a notion of revenge or justice then we are probably not doing what’s in the big-picture best interests of society.
You may feel that some people simply deserve to suffer because of the choices they have made. However, think about this: if you were that person would you have made a different choice? If you had been born with the same DNA, been raised by the same parents, grown up in the same community, and had the same random things happen in your life would you have made a different choice? If your brain was wired the exact same way and was in the exact same state could you have made a different choice?
My view is that we need to have compassion for criminals - chances are they haven’t been so lucky in life. We should let our compassion drive us towards figuring out how we can, as a society, fix the underlying problems that produce criminality. If all we are doing is blaming individuals for poor choices we are not going to make much progress. We need to think about the big picture. Perhaps about poverty, education, mental health, and the war on drugs.
I find it rather disgusting how many people we have locked in jails, how we accept such poor conditions for prisoners, and that the death penalty is a thing that still exists.
It seems to me that at this point in human history, capitalism and the continued advancement of technology are going to produce greater and greater levels of inequality. (Assuming we don’t intervene.) Additionally, I have a strong suspicion that further increases in inequality would be quite bad for society in a variety of ways.
The simple fact of the matter is that the internet, artificial intelligence, and robotics are allowing smaller and smaller groups of people to produce more and more of the things that we value in society. Whereas previously the invention of the car produced many jobs - the invention of the self driving car most certainly will not. All the value being created by taxi drivers and truck drivers will soon get created without them. The only people making money will be the shareholders of companies like Google and Uber as well as the really smart people that they employ.
So while I’m psyched for the arrival of driverless cars and super optimistic about the potential of technology to continue improving people’s lives, I think it would be pretty sad if we allowed all of the benefits to go to just a small segment of the population. Particularly when there are so many people suffering in poverty. I think going forward we will need to adjust the mindset that says capitalism will naturally produce a fair and just distribution of resources. If you’re someone that has a survival of the fittest type of mindset here it probably just means a computer hasn’t taken your job yet. In the future it’s likely artificial intelligence and robotics will make all (or almost all) need for human work obsolete. Then what? It seems unlikely everyone would want to declare themselves unfit for having and enjoying life.
So what can we do? Training people for new jobs may help in the short term. But I don’t think everyone can be a programmer or entrepreneur. I think we are going to need to tackle inequality directly. On a national level an idea might be to fix inequality at some level we think is acceptable and have tax rates automatically adjust to maintain that. So long as we don’t try to enforce too much equality the incentives inherent in capitalism will keep working just fine. As far as spreading the wealth, I think that for efficiency’s sake a good idea might be to scrap all welfare programs and just pay everyone out a base income.
On an international level I’d say we need more people and organizations taking on the mindset of the Bill & Melinda Gates Foundation. They’ve started from the premise that every human life is equal and dedicated their organization to saving as many lives as possible. With such huge inequality in the world it’s possible to have huge effects if you target your efforts appropriately.
For early education I’d want to redefine the purpose as preparing people for life and not just work. We should teach life skills and social skills and not just academic subjects. We should do our best to engender empathy, minimize bullying, and make explicit goals of raising kids into mentally healthy and happy adults.
I think kids should always know why they need to learn what we are teaching them. If we think about this, I think we’ll discover that for many of the things we teach there is simply no reason to make them mandatory. That said, I do believe kids should be exposed to a wide variety of topics so they have a chance to explore, discover, and pursue their own interests.
More important than teaching facts is teaching critical thinking skills. Part of that is developing an understanding of the scientific process so people understand it’s a method for discovering truths and not just another belief system. Science needs to be respected for what it is. In a future where technology gives individuals great power to do harm it’s simply too risky to have large numbers of people susceptible to non-science (non-reality) based belief systems.
The idea of teachers standing up to lecture students should be considered a dead model. Why would we have thousands of teachers worldwide giving the same talks year after year when we can just take the time to produce one superior version that students can watch online? And why stop at simply recording lectures when we could create rich interactive game-type experiences where students can learn while having fun? There’s potential for every student to work at their own pace at a difficulty level just right for them. Teachers would be free to add value through one-on-one mentoring. To read more about these ideas I recommend Salman Khan’s book The One World Schoolhouse.
Looking at college - If what we are concerned about is learning and preparing people for jobs then I’d say we have a terribly inefficient system of achieving that. In my opinion people can generally learn much more efficiently sitting at home with a computer (or in a study group) - not paying a dime - than they can sitting in a lecture hall. The problem is employers look for college degrees. What we need to do is decouple learning from the validation of what’s been learned. This would allow the free market to work in education. People would be free to learn in whatever way they thought best. If they don’t want to go into tens of thousands of dollars in debt they wouldn’t have to.
College is about a lot more than just learning and getting a degree though. To this day one of my regrets is not attending a top university. The reason? College is the best place to meet smart people to start companies with and find smart women to date. So can we replicate the whole college experience without the massive cost overhead? Here’s my idea: Someone needs to create some kind of open source university that runs without administrators or professors. There would be tools to help students self-organize into MOOC study groups or perhaps project based learning groups. And as long as there is an admissions process that only lets the best and brightest in I think we’d see that a “degree” from this university would very quickly gain value as a hiring signal.
Thanks for reading! In case you forgot - I’m building a team for Thinklab - please get in touch!
Recently I wrote an article on how I made 500k with machine learning and HFT (high frequency trading). I submitted the article to Hacker News and it ended up receiving 50,000 unique views. WOW! I thought I had something awesome to share but I had no idea that would happen! It also received 673 likes and 327 tweets. Amazing! By comparison, the actual business I’m launching has received just 135 likes so far - boohoo - though I’m confident that will increase soon.
As a result of the commentary on Hacker News as well as other places around the web I realized there were a lot of important things that I failed to mention in my article. I didn’t really target what I wrote to a trader audience so now I want to take some time to answer the most common questions I received.
Q: Was this really a high frequency trading system?
I would describe my system as a combination of market making, very micro-scale statistical arbitrage, and HFT. It seems that the term HFT is not very clearly defined so let me just tell you a few things about what I was doing and you can decide for yourself:
Q: Isn’t this just a case of survivorship bias?
Quite a few people brought this up. I think it’s great to keep something like this in mind because the fact is many people have tried algorithmic trading and surely most have failed. The difference is they are not writing up a blog post that gets viewed by thousands of people - that’s true.
However, it seemed the implication was that I was simply the guy who got lucky. On this point I strongly disagree. I believe some of the doubt stems from the fact that many people didn’t understand the difference between “running a strategy” versus HFT where you’re picking off consistent profits while exploiting small market inefficiencies. To be fair this is partly my fault for not explaining what I was doing properly.
In any case I am extremely confident I was not just getting lucky. There was no directional bias to my system so the fact I traded generally through an up market would not account for my results. My program made long and short trades all throughout the day and did just as well on a down day as an up. Position sizes were always small and as mentioned in the article I made upwards of a thousand trades a day and never lost more than $2000 in one day.
Q: What were the indicators you used?
My article glossed over the indicators my system used for predicting price moves partly because I didn’t think they were the most interesting part of what I did and partly because I didn’t want to get into any math. I thought what was cool was that I built a framework that allowed me to experiment with many indicators and quickly see which indicators had a meaningful ability to predict prices.
However, let me try to give a few more details. I’d say I had two types of indicators. The first type was based on the market microstructure (I think that’s the correct term) of the contract I was trading. So for example - if there is more size on the bid versus the offer an up move was considered more likely. An up move would also be considered more likely if there was a recent trade executed on the offer.
The other type of indicator involved looking at markets that were correlated with what I was trading and making predictions based on what was happening in them (i.e. arbitrage). So essentially if the S&P moved up it’s likely the Russell would move up as well. Most of these indicators were valid at very small time scales (i.e. milliseconds).
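As a rough illustration of the first type, here is the simplest possible form of the two microstructure signals just described, written in Python rather than the original C#. The function names and the exact formulas are assumptions made for this sketch, not the real indicators.

```python
def book_imbalance(bid_size, offer_size):
    """Return a value in [-1, 1]; positive means more size on the bid."""
    total = bid_size + offer_size
    if total == 0:
        return 0.0
    return (bid_size - offer_size) / total


def last_trade_signal(trade_price, inside_bid, inside_offer):
    """+1 if the last trade executed on the offer, -1 if on the bid, else 0."""
    if trade_price >= inside_offer:
        return 1
    if trade_price <= inside_bid:
        return -1
    return 0


print(book_imbalance(30, 10))                    # more size on the bid
print(last_trade_signal(100.25, 100.0, 100.25))  # trade executed on the offer
```

Both functions return a positive number when an up move is considered more likely and a negative number when a down move is.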
The indicators were indeed more sophisticated than what I describe here but hopefully you get the idea. Also, keep in mind predicting prices was only part of the story - I also relied on market making tricks such as keeping bids in the market if other orders were queued up behind mine but canceling them if not. As mentioned in the article my success required finding “safe” places to bid and offer.
The original article can be viewed here.
This post will detail what I did to make approx. 500k from high frequency trading from 2009 to 2010. Since I was trading completely independently and am no longer running my program I’m happy to tell all. My trading was mostly in Russell 2000 and DAX futures contracts.
The key to my success, I believe, was not in a sophisticated financial equation but rather in the overall algorithm design which tied together many simple components and used machine learning to optimize for maximum profitability. You won’t need to know any sophisticated terminology here because when I set up my program it was all based on intuition. (Andrew Ng’s amazing machine learning course was not yet available - btw if you click that link you’ll be taken to my current project: CourseTalk, a review site for MOOCs)
First, I just want to demonstrate that my success was not simply the result of luck. My program made 1000-4000 trades per day (half long, half short) and never got into positions of more than a few contracts at a time. This meant the random luck from any one particular trade averaged out pretty fast. The result was I never lost more than $2000 in one day and never had a losing month:
(EDIT: These figures are after paying commissions)
And here’s a chart to give you a sense of the daily variation. Note this excludes the last 7 months because - as the figures stopped going up - I lost my motivation to enter them.
My trading background
Prior to setting up my automated trading program I’d had 2 years’ experience as a “manual” day trader. This was back in 2001 - it was the early days of electronic trading and there were opportunities for “scalpers” to make good money. I can only describe what I was doing as akin to playing a video game / gambling with a supposed edge. Being successful meant being fast, being disciplined, and having good intuitive pattern recognition abilities. I was able to make around $250k, pay off my student loans and have money left over. Win!
Over the next five years I would launch two startups, picking up some programming skills along the way. It wouldn’t be until late 2008 that I would get back into trading. With money running low from the sale of my first startup, trading offered hopes of some quick cash while I figured out my next move.
A trading API
In 2008 I was “manually” day trading futures using software called T4. I’d been wanting some customized order entry hotkeys, so after discovering T4 had an API, I took on the challenge of learning C# (the programming language required to use the API) and went ahead and built myself some hotkeys.
After getting my feet wet with the API I soon had bigger aspirations: I wanted to teach the computer to trade for me. The API provided both a stream of market data and an easy way to send orders to the exchange - all I had to do was create the logic in the middle.
Below is a screenshot of a T4 trading window. What was cool is that when I got my program working I was able to watch the computer trade on this exact same interface. Watching real orders popping in and out (by themselves with my real money) was both thrilling and scary.
The design of my algorithm
From the outset my goal was to set up a system such that I could be reasonably confident I’d make money before ever making any live trades. To accomplish this I needed to build a trading simulation framework that would - as accurately as possible - simulate live trading.
While trading in live mode required processing market updates streamed through the API, simulation mode required reading market updates from a data file. To collect this data I set up the first version of my program to simply connect to the API and record market updates with timestamps. I ended up using 4 weeks’ worth of recent market data to train and test my system on.
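The record-then-replay idea can be sketched in a few lines of Python. This is not the original C# code and it does not touch the T4 API. The CSV layout, the field order, and the `record_update` and `replay` names are all assumptions made for illustration.

```python
import csv
import io


def record_update(writer, timestamp, bid, bid_size, offer, offer_size):
    """Append one timestamped market update to the data file."""
    writer.writerow([timestamp, bid, bid_size, offer, offer_size])


def replay(file_obj, on_update):
    """Feed recorded updates to the same handler that live mode would call."""
    for row in csv.reader(file_obj):
        ts, bid, bid_size, offer, offer_size = row
        on_update(float(ts), float(bid), int(bid_size),
                  float(offer), int(offer_size))


# An in-memory buffer stands in for the data file in this sketch.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
record_update(writer, 1.5, 100.0, 12, 100.25, 7)
record_update(writer, 2.0, 100.0, 9, 100.25, 7)

buffer.seek(0)
seen = []
replay(buffer, lambda *update: seen.append(update))
print(seen)
```

The point of the design is that the trading logic only ever sees `on_update` calls, so it cannot tell whether the updates come from a live feed or from a file.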
With a basic framework in place I still had the task of figuring out how to make a profitable trading system. As it turns out my algorithm would break down into two distinct components, which I’ll explore in turn:
Predicting price movements
Perhaps an obvious component of any trading system is being able to predict where prices will move. And mine was no exception. I defined the current price as the average of the inside bid and inside offer and I set the goal of predicting where the price would be in the next 10 seconds. My algorithm would need to come up with this prediction moment-by-moment throughout the trading day.
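A small Python sketch of that definition: the price is the midpoint of the inside bid and offer, and the thing to predict is how far that midpoint moves over the next 10 seconds. The tick format and the function names are assumptions for illustration, and the prices are made up.

```python
from bisect import bisect_right


def mid_price(inside_bid, inside_offer):
    """Current price: the average of the inside bid and inside offer."""
    return (inside_bid + inside_offer) / 2.0


def label_future_moves(ticks, horizon=10.0):
    """ticks: list of (timestamp_seconds, inside_bid, inside_offer), time-sorted.

    Returns (timestamp, mid, change in mid after `horizon` seconds) for every
    tick that has at least `horizon` seconds of data after it.
    """
    times = [t for t, _, _ in ticks]
    mids = [mid_price(bid, offer) for _, bid, offer in ticks]
    labelled = []
    for i, t in enumerate(times):
        if times[-1] < t + horizon:
            break  # not enough future data to know the price at t + horizon
        # Last update at or before t + horizon gives the price at that moment.
        j = bisect_right(times, t + horizon) - 1
        labelled.append((t, mids[i], mids[j] - mids[i]))
    return labelled


ticks = [
    (0.0, 100.0, 100.5),
    (4.0, 100.5, 101.0),
    (10.0, 101.0, 101.5),
    (15.0, 100.5, 101.0),
]
moves = label_future_moves(ticks)
print(moves)
```

These labelled moves are what any indicator would then be scored against.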
Creating & optimizing indicators
I created a handful of indicators that proved to have a meaningful ability to predict short term price movements. Each indicator produced a number that was either positive or negative. An indicator was useful if more often than not a positive number corresponded with the market going up and a negative number corresponded with the market going down.
My system allowed me to quickly determine how much predictive ability any indicator had so I was able to experiment with a lot of different indicators to see what worked. Many of the indicators had variables in the formulas that produced them and I was able to find the optimal values for those variables by doing side by side comparisons of results achieved with varying values.
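Here is a toy Python version of that workflow: score an indicator by how often its sign agrees with the next price move, then compare several values of a variable side by side. The `momentum` indicator, its `lookback` variable, and the tiny price series are hypothetical stand-ins. The actual indicators and scoring were more involved.

```python
def hit_rate(indicator_values, price_moves):
    """Fraction of observations where the indicator's sign matches the move.

    Observations where either value is zero are ignored.
    """
    hits = total = 0
    for value, move in zip(indicator_values, price_moves):
        if value == 0 or move == 0:
            continue
        total += 1
        if (value > 0) == (move > 0):
            hits += 1
    return hits / total if total else 0.5


def momentum(mids, lookback):
    """Hypothetical indicator: price change over the last `lookback` updates."""
    return [mids[i] - mids[i - lookback] if i >= lookback else 0.0
            for i in range(len(mids))]


def compare_lookbacks(mids, moves, candidates):
    """Side-by-side comparison of one indicator variable."""
    results = {lb: hit_rate(momentum(mids, lb), moves) for lb in candidates}
    return max(results, key=results.get), results


mids = [1, 2, 3, 4, 3, 2, 1, 2, 3, 4]
# Move following each update (the last one has no following update).
moves = [mids[i + 1] - mids[i] for i in range(len(mids) - 1)] + [0]
best, results = compare_lookbacks(mids, moves, candidates=[1, 3])
print(best, results)
```

On this made-up series a lookback of 1 agrees with the next move 75% of the time, while a lookback of 3 does worse than a coin flip.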
The indicators that were most useful were all relatively simple and were based on recent events in the market I was trading as well as the markets of correlated securities.
Making exact price move predictions
Having indicators that simply predicted an up or down price movement wasn’t enough. I needed to know exactly how much price movement was predicted by each possible value of each indicator. I needed a formula that would convert an indicator value to a price prediction.
To accomplish this I tracked predicted price moves in 50 buckets that depended on the range that the indicator value fell in. This produced unique predictions for each bucket that I was then able to graph in Excel. As you can see the expected price change increases as the indicator value increases.
Based on a graph such as this I was able to make a formula to fit the curve. In the beginning I did this “curve fitting” manually but I soon wrote up some code to automate this process.
Note that not all the indicator curves had the same shape. Also note the buckets were logarithmically distributed so as to spread the data points out evenly. Finally note that negative indicator values (and their corresponding downward price predictions) were flipped and combined with the positive values. (My algorithm treated up and down exactly the same.)
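The bucketing step might look roughly like this in Python. It is one reading of the description above, not the original code: the bucket edges are log-spaced between an assumed `min_value` and `max_value`, negative indicator values are flipped together with their price moves, and each bucket stores the average move seen there.

```python
import math


def bucket_index(abs_value, min_value, max_value, n_buckets=50):
    """Log-spaced bucket number for an absolute indicator value."""
    clipped = min(max(abs_value, min_value), max_value)
    frac = math.log(clipped / min_value) / math.log(max_value / min_value)
    return min(int(frac * n_buckets), n_buckets - 1)


def average_move_per_bucket(indicator_values, price_moves,
                            min_value, max_value, n_buckets=50):
    """Average subsequent price move for each indicator bucket."""
    sums = [0.0] * n_buckets
    counts = [0] * n_buckets
    for value, move in zip(indicator_values, price_moves):
        if value == 0:
            continue
        if value < 0:
            # Treat up and down the same: flip the indicator and the move.
            value, move = -value, -move
        b = bucket_index(value, min_value, max_value, n_buckets)
        sums[b] += move
        counts[b] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Two buckets instead of 50 so the output is easy to read.
curve = average_move_per_bucket(
    indicator_values=[2, 50, -5, -20],
    price_moves=[1.0, 3.0, -2.0, 1.0],
    min_value=1, max_value=100, n_buckets=2,
)
print(curve)
```

Plotting the resulting list against the bucket number gives the kind of curve that would then be fitted with a formula.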
Combining indicators for a single prediction
An important thing to consider was that each indicator was not entirely independent. I couldn’t simply just add up all the predictions that each indicator made individually. The key was to figure out the additional predictive value that each indicator had beyond what was already predicted. This wasn’t too hard to implement but it did mean that if I was “curve fitting” multiple indicators at the same time I had to be careful; changing one would affect the predictions of another.
In order to “curve fit” all of the indicators at the same time I set up the optimizer to step only 30% of the way towards the new prediction curves with each pass. With this 30% jump I found that the prediction curves would stabilize within a few passes.
With each indicator now giving us its additional price prediction I could simply add them up to produce a single prediction of where the market would be in 10 seconds.
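One way to picture the 30% stepping is the Python sketch below. It is an interpretation, not the original optimizer: each indicator's curve is stored as one value per bucket, the "new" curve for an indicator is the average price move left unexplained by the other indicators, and all curves move 30% of the way towards their new values on every pass. The function names and data layout are assumptions.

```python
def fit_additive_curves(bucket_ids, moves, n_buckets, passes=10, step=0.3):
    """bucket_ids: one tuple per observation with a bucket index per indicator.

    Returns one prediction curve (list of per-bucket values) per indicator.
    """
    n_ind = len(bucket_ids[0])
    curves = [[0.0] * n_buckets for _ in range(n_ind)]
    for _ in range(passes):
        targets = []
        for k in range(n_ind):
            sums = [0.0] * n_buckets
            counts = [0] * n_buckets
            for obs, move in zip(bucket_ids, moves):
                # What the other indicators already predict for this observation.
                others = sum(curves[j][obs[j]] for j in range(n_ind) if j != k)
                sums[obs[k]] += move - others
                counts[obs[k]] += 1
            targets.append([sums[b] / counts[b] if counts[b] else curves[k][b]
                            for b in range(n_buckets)])
        # Move every curve only part of the way towards its new target.
        for k in range(n_ind):
            for b in range(n_buckets):
                curves[k][b] += step * (targets[k][b] - curves[k][b])
    return curves


def predict(curves, obs):
    """Single prediction: the sum of each indicator's additional prediction."""
    return sum(curve[b] for curve, b in zip(curves, obs))


bucket_ids = [(0, 0), (0, 1), (1, 0), (1, 1)]
moves = [0.0, 1.0, 2.0, 3.0]
curves = fit_additive_curves(bucket_ids, moves, n_buckets=2, passes=40)
print([round(predict(curves, obs), 3) for obs in bucket_ids])
```

If two indicators always fall in the same bucket, a full 100% step makes their curves overshoot and flip back and forth forever, while a partial step lets them settle. That is the intuition behind not jumping all the way.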
Why predicting prices is not enough
You might think that with this edge on the market I was golden. But you need to keep in mind that the market is made up of bids and offers - it’s not just one market price. Success in high frequency trading comes down to getting good prices and it’s not that easy.
The following factors make creating a profitable system difficult:
Building a full trading simulation
So I had a framework that allowed me to backtest and optimize indicators. But I had to go beyond this - I needed a framework that would allow me to backtest and optimize a full trading system; one where I was sending orders and getting in positions. In this case I’d be optimizing for total P&L and to some extent average P&L per trade.
This would be trickier and in some ways impossible to model exactly but I did as best as I could. Here are some of the issues I had to deal with:
To refine my order execution simulation what I did was take my log files from live trading through the API and compare them to log files produced by simulated trading from the exact same time period. I was able to get my simulation to the point that it was pretty accurate and for the parts that were impossible to model exactly I made sure to at least produce outcomes that were statistically similar (in the metrics I thought were important).
Making profitable trades
With an order simulation model in place I could now send orders in simulation mode and see a simulated P&L. But how would my system know when and where to buy and sell?
The price move predictions were a starting point but not the whole story. What I did was create a scoring system for each of 5 price levels on the bid and offer. These included one level above the inside bid (for a buy order) and one level below the inside offer (for a sell order).
If the score at any given price level was above a certain threshold that would mean my system should have an active bid/offer there - below the threshold then any active orders should be cancelled. Based on this it was not uncommon that my system would flash a bid in the market then immediately cancel it. (Although I tried to minimize this as it’s annoying as heck to anyone looking at the screen with human eyes - including me.)
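The thresholding rule translates almost directly into code. In this Python sketch the `manage_orders` name, the score values, and the data shapes are assumptions. A score exactly at the threshold is treated as "cancel", which the description above leaves open.

```python
def manage_orders(scores, active, threshold):
    """Decide which price levels need a new order and which need a cancel.

    scores: {price_level: score} for one side of the market.
    active: set of price levels where an order is currently working.
    """
    wanted = {price for price, score in scores.items() if score > threshold}
    to_place = sorted(wanted - active)
    to_cancel = sorted(active - wanted)
    return to_place, to_cancel


bid_scores = {99.0: 0.2, 99.1: 0.8, 99.2: 1.5}
place, cancel = manage_orders(bid_scores, active={99.0, 99.2}, threshold=0.5)
print(place, cancel)
```

Because the scores are recomputed on every market update, a level can cross the threshold and drop back almost immediately, which is how a bid ends up flashing in the market.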
The price level scores were calculated based on the following factors:
Essentially these factors served to identify “safe” places to bid/offer. The price move prediction alone was not adequate because it did not account for the fact that when placing a bid I was not automatically filled - I only got filled if someone sold to me there. The reality was that the mere fact of someone selling to me at a certain price changed the statistical odds of the trade.
The variables used in this step were all subject to optimization. This was done in the exact same way as I optimized variables in the price move indicators except in this case I was optimizing for bottom line P&L.
What my program ignored
When trading as humans we often have powerful emotions and biases that can lead to less than optimal decisions. Clearly I did not want to codify these biases. Here are some factors my system ignored:
Since my algorithm made decisions the same way regardless of where it entered a trade or if it was currently long or short it did occasionally sit in (and take) some large losing trades (in addition to some large winning trades). But you shouldn’t think there wasn’t any risk management.
To manage risk I enforced a maximum position size of 2 contracts at a time, occasionally bumped up on high volume days. I also had a maximum daily loss limit to safeguard against any unexpected market conditions or a bug in my software. These limits were enforced in my code but also in the backend through my broker. As it happened I never encountered any significant problems.
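A pre-trade check along these lines could be sketched as follows in Python. The 2-contract cap comes from the text above. The `max_daily_loss` figure is a placeholder (the actual limit is not stated here), and the rule that only position-reducing orders are allowed once the loss limit is hit is an assumption of this sketch.

```python
class RiskLimits:
    """Hypothetical pre-trade check for position size and daily loss."""

    def __init__(self, max_position=2, max_daily_loss=2000.0):
        self.max_position = max_position      # contracts, long or short
        self.max_daily_loss = max_daily_loss  # placeholder value

    def allows(self, position, order_qty, daily_pnl):
        """order_qty is signed: positive to buy, negative to sell."""
        new_position = position + order_qty
        if daily_pnl <= -self.max_daily_loss:
            # Loss limit hit: only orders that shrink the position are allowed.
            return abs(new_position) < abs(position)
        return abs(new_position) <= self.max_position


limits = RiskLimits()
print(limits.allows(position=1, order_qty=1, daily_pnl=-100.0))    # within cap
print(limits.allows(position=2, order_qty=1, daily_pnl=-100.0))    # over cap
print(limits.allows(position=2, order_qty=-1, daily_pnl=-2500.0))  # reducing
```

A check like this in the program is only the first line of defense. The same limits were also enforced on the broker side.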
Running the algorithm
From the moment I started working on my program it took me about 6 months before I got it to the point of profitability and began running it live. Although to be fair a significant amount of that time was spent learning a new programming language. As I worked to improve the program I saw increased profits for each of the next four months.
Each week I would retrain my system based on the previous 4 weeks’ worth of data. I found this struck the right balance between capturing recent market behavioral trends and ensuring my algorithm had enough data to establish meaningful patterns. As the training began taking more and more time I split it out so that it could be performed by 8 virtual machines using Amazon EC2. The results were then coalesced on my local machine.
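The bookkeeping for this can be sketched in Python as below. How the training work was actually divided between machines is not described, so the round-robin split of independent tasks and the dictionary-merge step are assumptions, as are all of the names.

```python
from datetime import date, timedelta


def training_window(retrain_day, weeks=4):
    """Start and end dates of the data used for one weekly retrain."""
    return retrain_day - timedelta(weeks=weeks), retrain_day


def split_work(tasks, n_workers=8):
    """Deal independent training tasks out to the workers round-robin."""
    return [tasks[i::n_workers] for i in range(n_workers)]


def coalesce(partial_results):
    """Merge the {task: result} dictionaries sent back by each worker."""
    merged = {}
    for result in partial_results:
        merged.update(result)
    return merged


start, end = training_window(date(2009, 10, 5))
chunks = split_work(list(range(10)))
# Stand-in for the remote training runs: each worker "scores" its own tasks.
merged = coalesce({task: task * task for task in chunk} for chunk in chunks)
print(start, end, len(chunks), len(merged))
```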
The high point of my trading was October 2009 when I made almost 100k. After this I continued to spend the next four months trying to improve my program despite decreased profit each month. Unfortunately by this point I guess I’d implemented all my best ideas because nothing I tried seemed to help much.
With the frustration of not being able to make improvements and not having a sense of growth I began thinking about a new direction. I emailed 6 different high frequency trading firms to see if they’d be interested in purchasing my software and hiring me to work for them. Nobody replied. I had some new startup ideas I wanted to work on so I never followed up.
UPDATE - I posted this on Hacker News and it has gotten a lot of attention. I just want to say that I do not advocate anyone trying to do something like this themselves now. You would need a team of really smart people with a range of experiences to have any hope of competing. Even when I was doing this I believe it was very rare for individuals to achieve success (though I had heard of others.)
There is a comment at the top of the page that mentions “manipulated statistics” and refers to me as a “retail investor” that quants would “gleefully pick off”. This is a rather unfortunate comment that’s simply not based in reality. Setting that aside there are some interesting comments: http://news.ycombinator.com/item?id=4748624
UPDATE #2 - I’ve posted a follow-up FAQ that answers some common questions I’ve received from traders about this post.