We need to set up an automated build and test service so that we can iterate faster and track our improvements over time.
We need to clean up our testing harness, tighten up our test criteria, and make it easy to add new tests. We need to make our testing more scriptable, and support sharding across multiple testing servers as our test suite grows.
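As a rough illustration of the sharding idea, one simple scheme is to number the tests and give each server the tests whose index matches its shard number modulo the shard count. The sketch below assumes that scheme; the function names `test_in_shard` and `count_tests_in_shard` are hypothetical and not part of any existing harness.

```c
#include <stddef.h>

/* Hypothetical helper: returns nonzero if the test with index test_index
 * should run on shard shard_index out of shard_count shards.
 * Assumes a simple round-robin (modulo) assignment. */
int test_in_shard(size_t test_index, size_t shard_index, size_t shard_count) {
  if (shard_count == 0 || shard_index >= shard_count) return 0;
  return test_index % shard_count == shard_index;
}

/* Hypothetical helper: number of tests a given shard would run. */
size_t count_tests_in_shard(size_t total_tests, size_t shard_index,
                            size_t shard_count) {
  size_t i, n = 0;
  for (i = 0; i < total_tests; ++i) {
    if (test_in_shard(i, shard_index, shard_count)) ++n;
  }
  return n;
}
```

Each test lands on exactly one shard, and the assignment depends only on the test index and the shard count, so every server can compute its own share without coordination. It does not balance by test duration, which a real harness would probably want.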
By making the functions in the codec smaller and better contained, we can improve readability and move towards unit testing.
There's room for improvement in how data flows through the codec and how certain things are structured; cleaning these up should give a speed improvement. One example is removing unnecessary copies of the frames passed to the encoder by using a reference-counted buffer pool.
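To make the buffer pool idea concrete, here is a minimal single-threaded sketch of a reference-counted pool. All names (`frame_buf`, `frame_pool`, `pool_acquire`, `buf_ref`, `buf_unref`) are hypothetical and do not correspond to existing codec code; a real version would also need locking or atomic counters if buffers are shared across threads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical types for illustration; not part of the codec's API. */
typedef struct frame_buf {
  unsigned char *data; /* frame storage, allocated once per slot */
  int ref_count;       /* 0 means the slot is free */
} frame_buf;

typedef struct frame_pool {
  frame_buf *slots;
  size_t count;
  size_t frame_size;
} frame_pool;

/* Allocate count buffers of frame_size bytes. Returns 0 on success. */
int pool_init(frame_pool *p, size_t count, size_t frame_size) {
  size_t i;
  p->slots = calloc(count, sizeof(*p->slots));
  if (!p->slots) return -1;
  p->count = count;
  p->frame_size = frame_size;
  for (i = 0; i < count; ++i) {
    p->slots[i].data = malloc(frame_size);
    if (!p->slots[i].data) {
      while (i > 0) free(p->slots[--i].data);
      free(p->slots);
      p->slots = NULL;
      p->count = 0;
      return -1;
    }
  }
  return 0;
}

/* Take a free buffer; the caller holds the first reference. */
frame_buf *pool_acquire(frame_pool *p) {
  size_t i;
  for (i = 0; i < p->count; ++i) {
    if (p->slots[i].ref_count == 0) {
      p->slots[i].ref_count = 1;
      return &p->slots[i];
    }
  }
  return NULL; /* pool exhausted */
}

/* Share a buffer with another consumer instead of copying it. */
frame_buf *buf_ref(frame_buf *b) {
  ++b->ref_count;
  return b;
}

/* Drop one reference; the slot is reusable once the count reaches 0. */
void buf_unref(frame_buf *b) {
  assert(b->ref_count > 0);
  --b->ref_count;
}

/* Free all storage owned by the pool. */
void pool_destroy(frame_pool *p) {
  size_t i;
  for (i = 0; i < p->count; ++i) free(p->slots[i].data);
  free(p->slots);
  p->slots = NULL;
  p->count = 0;
}
```

With something like this, the application could fill a buffer from `pool_acquire`, hand it to the encoder, and the encoder would call `buf_ref` to keep the frame alive rather than copying it. The storage returns to the pool only after every holder has called `buf_unref`.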
We need to continue optimizing VP8 for both desktop and embedded platforms, like Atom and ARM. VP8 is pretty fast, but we can always make it faster.
Explore other approaches to multithreading. We suspect that our multithreading implementation has thread starvation issues, and that fixing them could improve performance.
Explore hardware-level interfaces to acceleration blocks.
Look at GPGPU - offload encoding and decoding to the GPU where possible.