My Calling - To Grok

For Rick Schonfeld of the American Red Cross (and other random readers):

We were having a discussion about how we, the students in CRASAR, got into computer science and robotics. I said my reason was that I had found my calling in CS in high school, and you asked what, specifically, that calling was.

I said that it was system design - starting with a blank slate and a desired functionality and producing a working system. That’s a bit of a lie.

Designing systems is one of the things I enjoy, but developing a working system (something from nothing) isn’t quite why I like it. The actual reason is the understanding that comes with being a systems developer. I have a friend (and former coworker) who, when asked how well he understood computers (part of a “what should be a student’s first language” debate), replied “I understand it down to the level of electrons moving… I stopped at why the electrons are moving, but above that I understand.”

Being a computer scientist teaches you how to think; it teaches you how to analyze a problem and how to come to a solution. Once you have a solution in mind, you learn to implement it. When you implement a solution, if you are worth your salt, you don’t just know how it works in vague and abstract terms, but rather you understand it. You grok it. You can watch it run and envision each individual component operating, running along at 2 billion operations per second (or merely 1 million depending on your platform). You can envision it flowing as one fluid entity, existing within its own reality.

Grokking how something works is an amazing experience (similar to an ‘aha!’ moment, but more awesome and less fleeting). Learning, understanding, and grokking new and complex systems is why I am glad to be in Computer Science; and being exposed to new physical and electronic systems is why I am glad I am a Systems Developer.

It can also be quite terrifying. I designed and developed the software for a frac pump control system from scratch. I raised it from a wee tyke that could barely blink LEDs to a system now deployed on million dollar industrial vehicles. I know that it works, and I grok it. But every once in a while a problem comes up that doesn’t leave you wondering what happened, but rather exclaiming “That’s impossible… that’s not how this works.” So far when I’ve run into these situations, it seems I’ve been right (odd mechanical or deployment issues), but each time your heart skips a few beats as you’re forced to question yourself, your understanding, and the basis of the reality your system is built around.

But, it is worth it.

Put another way, a common joke is that real programmers use a magnetic pen to write data onto a hard disk. Somewhere out there are people who, given sufficiently steady hands and sufficiently small points to their pens, know enough about how a hard drive works that they could write data that way. There are people out there who, given the raw electrical signals off your ethernet cable pairs, could tell you what website you are surfing. These are those that grok.

Thanks to Rick for asking a question I probably should have known the answer to already.

Communication Issues in “The Challenge of the Multicores”

The Talk

A few weeks ago I listened in on a lecture given by Fran Allen, a Turing Award winner and a language/compiler/optimization researcher at IBM.

The talk was on “The Challenge of the Multicores”. It meandered a bit into the history of 60s-era IBM machines, a few reflections, and a basic explanation of the problem being encountered, with an emphasis on scientific computing. I think - the lecture was rather unfocused, and it was a few weeks ago, so memory=dead. Part of the lecture was a list of solutions to the multicore challenge:

  1. Cache needs to die a bloody and well-deserved death
  2. We should move away from low level general purpose languages (C/C++/Java/Python/etc)

(Humorous tilt added by me, but otherwise this is roughly what she said)

I’m not sure if there were more, but these two stuck in my head because of what ensued. At the mention of eliminating cache, you could hear almost every engineer in the audience groan in horror, and something roughly similar occurred at the death of low level languages. The questions at the end of the seminar were roughly 90% “How do we handle the discrepancy between CPU speed and memory speed without cache?”, “How could any low level general purpose language be completely displaced by a high level language?”, etc. This was a classic case of miscommunication.

The Death of Cache

With the first point, Fran isn’t saying “Get rid of the cache and have the CPU talk directly to memory” (or she might be, I’m not sure how well she understands hardware concerns). What she is saying is that how cache is implemented today is a large source of performance problems. Cache operates largely without instruction from the program - it makes educated guesses about what to load and what to keep. When it guesses wrong, a cache miss occurs, and a massive overhead of a microsecond or so is incurred while memory is fetched.

Here is the problem: cache guessing is based largely on assembly code and current use. A program can’t tell the CPU “I’m going to use this data soon”; a compiler can’t provide metadata to the CPU of the form “by the time this method finishes, we’re going to need data from one of these three locations”. Lack of certainty about what the cache will do impedes automated optimization. That is what she was getting at. If instead we were able to treat the cache as 4MB worth of really slow registers, with its own lesser instruction set and controller, we could see massive performance gains: an intelligent cache controller can get much closer to a perfect cache than a controller that only guesses based on current and previous state.

When phrased that way, cache death doesn’t seem like such an incredibly terrible idea. The cache is still there in some form, it’s just not the cache of today. There are still all manner of issues that would need to be worked out, but it’s a lot clearer that “yes, you may have a point there”.
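To make her point concrete, here is a toy simulation (my illustration, nothing from the talk): an LRU cache can only guess from its recent past, while Belady’s optimal policy, which is handed the future access pattern, stands in for the kind of informed cache controller that a compiler armed with program metadata could approximate.

```python
# Toy comparison: a cache that guesses from past behavior (LRU) versus an
# "oracle" controller that is told the future access pattern (Belady's
# optimal replacement). The gap between their miss counts is the headroom
# an informed cache controller could claim.

def lru_misses(accesses, size):
    cache, misses = [], 0
    for addr in accesses:
        if addr in cache:
            cache.remove(addr)          # refresh recency
        else:
            misses += 1
            if len(cache) == size:
                cache.pop(0)            # evict least recently used
        cache.append(addr)
    return misses

def belady_misses(accesses, size):
    cache, misses = set(), 0
    for i, addr in enumerate(accesses):
        if addr in cache:
            continue
        misses += 1
        if len(cache) == size:
            # evict the line whose next use is farthest in the future
            def next_use(line):
                try:
                    return accesses.index(line, i + 1)
                except ValueError:
                    return float('inf')
            cache.remove(max(cache, key=next_use))
        cache.add(addr)
    return misses

# A cyclic pattern of 4 lines through a 3-line cache defeats recency-based
# guessing entirely:
pattern = [0, 1, 2, 3] * 8
print(lru_misses(pattern, 3))     # LRU thrashes: every access misses
print(belady_misses(pattern, 3))  # the informed policy misses far less
```

On that pattern LRU misses on every single access, while the policy that knows the future misses only a handful of times; that gap is what she was pointing at.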

The Death of C

Again, she isn’t saying “C should be eradicated from everywhere and everything”; she’s saying that using a low level language strips away the semantics of what you’re trying to do. The best example I can think of is the difference between using SQL and hand coding a query over a data structure. (Ignore small issues like ethics, responsibility, pain, etc.)

If you encode a query yourself, you are responsible for parallelizing it, and for doing that correctly. That means you need to keep in mind data dependencies, lock acquisition orders, etc. You may need to take into account niggling little details about the machine architecture you’re working on if you want massive performance. If you have a sufficiently complex query, it might be worthwhile to pre-compute certain parts of the query and reuse them. And so on.

With SQL, instead of specifying the steps to get to the state you want, you specify “I want everything that looks like x, cross referenced with everything that feels like y, ordered by z”. Your database server is then responsible for building a query execution plan. It asks “What can I precompute to make this faster?”, “Do I need to do a linear scan through this entire data set, or do I have metadata somewhere such that I can do a binary search?”, etc.
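Here is a small sketch of that contrast, using Python’s sqlite3 as a stand-in (my example, not one from the talk): the hand-rolled version spells out every step, while the SQL version only describes the result and leaves the execution plan to the engine.

```python
# Hand-coded query vs. declarative query over the same toy data. The
# hand-rolled version specifies *how*: loop order, filtering, sorting.
# The SQL version states *what*, and sqlite's planner decides how
# (e.g. whether the index makes a full scan unnecessary).
import sqlite3

people = [("alice", 34), ("bob", 51), ("carol", 27), ("dave", 51)]

# Hand-coded: every step (and any parallelism) is my responsibility.
by_hand = sorted(name for name, age in people if age > 30)

# Declarative: describe the result; the engine picks the execution plan.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE people (name TEXT, age INTEGER)")
con.execute("CREATE INDEX idx_age ON people (age)")
con.executemany("INSERT INTO people VALUES (?, ?)", people)
by_sql = [row[0] for row in
          con.execute("SELECT name FROM people WHERE age > 30 ORDER BY name")]

assert by_hand == by_sql == ["alice", "bob", "dave"]
```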

SQL constrains what you can ask for, but in return it exploits all manner of optimizations and parallelizations without your lifting a finger. Fran’s focus is on high performance scientific computing. If the people doing a weather simulation say “I need the solution to Ax=b” rather than coding a matrix inverse or an LU decomposition by hand, the compiler can provide a very efficient implementation, one which may exploit optimizations that the programmers could not see. She isn’t saying “Throw out C#, write your web applications in MATLAB”; she is saying that current practice makes it hard for optimization researchers to help your code run faster.
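The scientific-computing analogue, sketched with numpy (again my example, not hers): state “I need the solution to Ax=b” and let the library pick the method.

```python
# The caller states the problem; the library chooses the method. Here
# numpy dispatches to a tuned LAPACK routine rather than any matrix
# inverse or hand-coded elimination I might write myself.
import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])

x = np.linalg.solve(A, b)          # "I need the solution to Ax = b"
assert np.allclose(A @ x, b)       # the library picked the how
```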

Conclusion

Communicate better. It’s better for you, your message, and us.

AI Project - Sudoku Through Constraint Satisfaction

One of the final projects in the AI class is to teach middle schoolers about an aspect of AI. I chose to teach the fun and joy of constraint satisfaction through Sudoku. Why constraint satisfaction? Because it was one of the first AI topics I came across in the class that made me think ‘Wow, this is awesome… and I haven’t seen it twenty times before’. It is a solid method for solving a specific class of problems.

To see it in action, look here. Source code for everything (from the simple (and terrible) PHP webpage to the Constraint Satisfaction Solvers) is available here.

I’ll warn you, though, that the code isn’t completely clean. A fair amount of the code in the various solvers could have been implemented… differently, such that it wouldn’t be a copy-and-paste job between the ~5 solver implementations.

Also, the way possible values are determined is a horrible kludge that I should have known better than to commit. Part of the goal of the Solver implementations is to make the intermediate states visible. To make that simpler, no actual recursion is used in the course of searching. Since no recursion is being used, it’s harder to make changes to intermediate state, such as ‘we need to remove values 5 and 6 from the possible values for cell 15’, and then undo those changes. Being lazy, I opted to just regenerate the possible values for all cells on each step of the algorithm. Terribly inefficient, but honestly that’s not a problem here; drawing the entire board on screen probably takes much more time.

What was unfortunate was that when I went to encode Forward Checking / Arc Consistency, I needed the Variables to store their own possible values for the code to be not terrible. As a result, Arc Consistency turned out rather messy, and surprisingly useless on the small Sudoku grid (or I did it wrong; but surely that is impossible), so I scrapped it and moved on. Overall, the implementation would be more robust if the variables were responsible for storing their individual values. And if I didn’t need those intermediate states.
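For the curious, here is a minimal sketch of that recompute-everything approach (my illustration, not the project’s actual code; the real solvers avoid recursion to expose intermediate states, while this sketch uses recursion for brevity). It assumes a board represented as a dict mapping (row, col) to a digit, with 0 meaning empty.

```python
# The lazy kludge described above: rather than incrementally updating
# candidate sets and undoing those updates on backtrack, recompute every
# empty cell's possible values from the current board on each step.

def peers(row, col):
    """All cells sharing a row, column, or 3x3 box with (row, col)."""
    box_r, box_c = 3 * (row // 3), 3 * (col // 3)
    cells = {(row, c) for c in range(9)} | {(r, col) for r in range(9)}
    cells |= {(box_r + r, box_c + c) for r in range(3) for c in range(3)}
    cells.discard((row, col))
    return cells

def possible_values(board):
    """Recompute, from scratch, the candidate digits for every empty cell."""
    return {
        cell: {d for d in range(1, 10)
               if d not in {board[p] for p in peers(*cell)}}
        for cell in board if board[cell] == 0
    }

def solve(board):
    """Backtracking search. Candidates are rebuilt on every step, so there
    is nothing to undo when we backtrack: exactly the kludge in question."""
    candidates = possible_values(board)
    if not candidates:
        return board                      # no empty cells left: solved
    # most-constrained-variable heuristic: try the fewest candidates first
    cell, values = min(candidates.items(), key=lambda kv: len(kv[1]))
    for digit in values:
        attempt = solve({**board, cell: digit})
        if attempt is not None:
            return attempt
    return None                           # dead end; caller tries next digit
```

Fed a dict with the givens filled in and zeros elsewhere, solve either returns a completed board or None. Wasteful, as noted, but the full-board recompute is dwarfed by the cost of drawing the board anyway.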

In conclusion: it was fun, I learned things, hopefully the middle schoolers will also learn things (or at least lie and say they did), and I now have other finals to worry about.

Special thanks, again, to Peter Norvig’s Solving Every Sudoku Puzzle. It provided entertaining quotes and thoughts, and the inclusion of a 62 digit number in the final website.