Broad Update

Yes, this shall be a blog updated once every few months to say ‘I should write more’. Whatever.

ACM and AI
Starting with the current purpose of this blog, the ACM Local competition was last week; and I successfully succeeded at succeeding. First place by one problem. I went to the last two ACM contests with one of my teammates, and the other one I have a fair amount of confidence in. We shall see what happens. But, as a humorous aside to that: I very nearly implemented a Sudoku solver in under 30 minutes. Had I stopped asking myself ‘is it worth it to put in this much effort into a problem you probably won’t solve before the contest is over’; I probably could have solved it. Probably not, but maybe. I solved it using a horribly hacked together Constraint Satisfaction based Solver. For that interesting 30-50 minutes of my life, I would like to thank Robin Murphy (AI Prof), and Peter Norvig (Algorithms God/AI Book Author); both of which suggested Sudoku solving as an application of CSP. I’ll probably clean it up later and post it… as it stands currently it makes me want to cry, but it theoretically should work. (Theoretically because the test data was, in fact, invalid).

Education- Dr. Tao?
On a different note, I think I’m going to go for a PhD. This conclusion was got to by an interesting failure in my ability to communicate. As part of a follow up email, I had written the line ‘assume I go for a PhD’, which led to the thought ‘If I graduate and go into industry with a Bach, I’m going to end up drifting from job to job with mounds of uncertainty and stress’ (drifting coming from the popular consensus that a programmer should expect to change jobs every 2-3 years). If I get a PhD, there’s a lot less uncertainty in terms of job (at least I hope), I have the opportunity to interact with really incredibly frustratingly annoyingly intelligent folk, and if I go the academia route, I can torture a class of Yankee’s with ‘Howdy Class’. How can you not like that?

As a side note, I suspect the reason I was never comfortable with the thought of Grad School before is because I had never really come across something that peaked my interests. There’s a popular quote by Dijkstra: “Computer Science is no more about Computers then Astronomy is about Telescopes”. I have this odd sense that it is often misinterpreted from it’s original context. Computer Science is about the mathematical basis behind computing: proving algorithms correct, classifying problems as NP-hard, finding the most efficient way to find a correct solution, all about the theory. To me, that is what Computer Science Grad School is/was about. And I hate that definition. Mostly because it feels like something from the Ivory Tower- occasionally interesting results with only limited real world applicability. I want to generate solutions to real problems. Is making a formally proven OS kernel useful? Maybe to some, but spending 5 years to develop 10,000 lines of proven code just feels like a waste. I’ll probably regret those statements later, but, such is life.

Further side note: one thing I’ve heard fairly commonly is that in general, you don’t recoup the cost of a PhD. A master’s is worth it, but a PhD won’t earn you more over a lifetime. My response: Who cares? If you’re making a career or education choice based soley on ‘will it pay off in the end’, then you’re doing something wrong. Education (and career) should be driven off personal interest, and as an auxiliary, will it earn enough money such that you won’t live in a cardboard box. Going into advanced education because it will buy your retirement home a heated pool is just plain wrong.

Shiny New Toy- Arduino
Yet another different tact, I have a shiny new toy. An Arduino. With a product name which I can neither remember, pronounce, nor spell. I can at least remember/spell Arduino… pronunciation is still off. I blame the Italians. I got it in part because I wanted to get involved in the electronics/embedded side of computing, and mostly because it provided a convenient way to solve a problem my brother was having.

My brother, through what I assume were family connections, was hired to fix an outdoor LED sign for a bank. The company that manufactured the sign had gone out of business, but he was able to get some schematics for how the individual boards in the sign were wired. While he could take the boards down, ohm out connections, and so on, he didn’t have a convenient way of actually testing that either the repaired sign worked, or finding where issues might lie in malfunctioning boards.

I psuedo-come-in at this point. He asks me if I can write a program that would be able to test the boards. The boards themselves are driven off 4 inputs- 2 serial lines, a clock, and a latch. In the sign they’re driven off of what is probably a parallel port on a 386 running windows 98. I’m not sure if you’ve noticed, future reader, but most computers these days don’t come with parallel ports. So I go through the thought process of ‘how could I go about this’; serial ports don’t really work, as at most you might be able to control voltage on two lines; we could go through the trouble of finding and installing a parallel port expansion on a testing computer, and try to write a program that would drive the port/board. And then I thought “wasn’t there something Kyle mentioned not long ago; an Arduino?”.

It fits the bill perfectly. So the short ending to this is: the first day I got it, I broke it out, did the standard “Make an LED flash” thing to see if everything was working; and the second day I built a controller for the sign boards. A bit later I added some simple configuration bits, a button to switch between testing modes, and hooked up the 9V battery so the computer became completely unnecessary. My brother has since stolen it for testing, until he can get his own.

So, I feel good about that. One thing I’ll probably post soon is a list of projects to go along with it. Y’Know, to go along with the other list of projects I’m not making progress on. I might also post more details about the sign, as it was a fun first project/jaunt into Computer Engineering.

Deutschland
I’m also planning to go to Bonn next Summer as part of a study abroad program. I’m trying to learn some German for that; and it’s coming along slowly. And that’s it. That’s all. See y’all later.

ACM 2008 SCUSA Wormholes- 2nd Edition

Last time we left our time travelling hero’s, they were using a repeated applications of the Bellman-Ford algorithm (at time complexity O(|V| * |E|), where |E| is on par with |V|, and number of requests is probably also on the order of |V|, giving a total of O(|V|^3)). Our valiant programmer desperately seeks to apply an the Floyd-Warshall algorithm to the problem, just so that the time spent reading the analysis of algorithms book isn’t wasted.

The secret to successfully applying the Floyd-Warshall algorithm to this problem is the same secret to getting reasonable performance out of the previous solution. You can’t look at each individual year, you instead must look at time spans between meaningful events. That brings the number of vertices down from many thousand to at most 301 (starting year + 200 distinct wormhole years + 100 mission years). It has the downside of making the process of finding weights a fair bit more complex:

public int[,] BuildInitialWeights(List<int> distinctVerticies)
{
    int[,] weights = new int[distinctVerticies.Count, distinctVerticies.Count];
    for(int i=0; i<distinctVerticies.Count; i++)
        for(int j=0; j<distinctVerticies.Count; j++)
        {
            if (i == j) weights[i,j] = 0;
            else weights[i,j] = Int32.MaxValue;
        }
 
    //assumes vertices are in order, from first year to last
    for (int i = 1; i < distinctVerticies.Count; i++)
    {
        int start = distinctVerticies[i - 1];
        int end = distinctVerticies[i];
        weights[i-1, i] = end - start;
    }
    foreach (int[] years in wormholes)
    {
        int i = distinctVerticies.IndexOf(years[0]);
        int j = distinctVerticies.IndexOf(years[1]);
        int weight;
        if (years[1] > years[0]) weight = (years[1] - years[0]) / 2;
        else weight = (years[1] - years[0]) / 4;
        weights[i, j] = weight;
    }
    return weights;
}

Here we’re building a 2D matrix of weights. Informally we’re mapping each year in the distinct vertices list to its index, then using that index as coordinates in the weight matrix. Weights are constructed by first setting all weights to their default values (0 to get to current node, max value for other nodes). Then default weights for moving forward with time are assigned. Finally, wormholes are accounted for. Wormhole accounting is done last as it will be, by definition, the shortest path between two points. This has the advantage that only one path is ever considered between two points, where as the previous solution allowed for multiple paths.

public List<int> BuildDistinctVerticies(List<int> addedYears)
{
    List<int> years = new List<int>(addedYears);
    foreach (EndPoint wormhole in wormholes)
    {
        years.Add(wormhole.Start);
        years.Add(wormhole.End);
    }
    years.Sort();
    for (int i = 1; i < years.Count; i++)
        if (years[i - 1] == years[i])
            years.RemoveAt(i--);
    return years;
}

This is the simple method for building all distinct vertices. It starts by taking all mission and starting years, adds each end point from a wormhole to the list, then sorts it and removes duplicates. Again we get into ‘it would be nice to have a Sorted Set’ (Should appear in .NET 4; or, alternatively, has always been in Java… accursed Java with its nicer Collections libraries…).

public int[,] AllPairs(List<int> missionYears, out List<int> distinctVerticies)
{
    distinctVerticies = BuildDistinctVerticies(missionYears);
    int[,] currentWeights = BuildInitialWeights(distinctVerticies);
    int[,] nextWeights = new int[distinctVerticies.Count, distinctVerticies.Count];
    for (int i = 0; i < distinctVerticies.Count; i++)
    {
        Extend(currentWeights, nextWeights, i);
        int[,] temp = currentWeights;
        currentWeights = nextWeights;
        nextWeights = temp;
    }
    return currentWeights;
}
 
public void Extend(int[,] previosLevel, int[,] thisLevel, int k)
{
    for(int i=0; i<previosLevel.GetLength(0); i++)
        for(int j=0; j<previosLevel.GetLength(1); j++)
        {
            int iToJ = previosLevel[i,j];
            int iToK = previosLevel[i,k];
            int kToJ = previosLevel[k,j];
            if (iToK != Int32.MaxValue && kToJ != Int32.MaxValue)
                thisLevel[i,j] = Math.Min(iToJ, iToK + kToJ);
            else thisLevel[i,j] = iToJ;
        }
}

I’m about 90-100% certain I misspelled previous. Ignoring that, here we can see Floyd-Warshall algorithm implemented. I don’t fully understand how it works. I’ve read that the ‘extension’ element is simulating whether or not the shortest path from i to j will go through k; but at this hour of the morning I still can’t quite wrap my head around how it works. Something to learn for the new school year.

The distinct verticies are passed as an out parameter to the caller, so that they have the map from index to year available to them.

List<int> years = new List<int>(missions);
years.Add(startingYear);
int[,] distances = timeline.AllPairs(years, out years);
 
foreach(int destination in missions)
{
    int i = years.IndexOf(startingYear);
    int j = years.IndexOf(destination);
    int toDistance = distances[i, j];
    int fromDistance = distances[j, i];
    if(toDistance==Int32.MaxValue || fromDistance==Int32.MaxValue)
        Console.WriteLine("IMPOSSIBLE");
    else
        Console.WriteLine(toDistance+fromDistance);
}

Finally, the relevant changes to the problem solving method. The down side to using this approach is that there isn’t a direct mapping from year to index; so a linear search is performed to get that mapping. It’s not really an issue, considering what else is being done, but it is a bit unfortunate.

Analysis
Psuedo-analysis might be more accurate. Comparing this to the previous solution, I like the fact that the algorithm seems more straight forward, and that in the end we end up with a massive table of possible paths, should our time travelling agents ever need to do strange other things. It also has a slight performance advantage over the straight version from last time. (It’s about on par with not continuing the Bellman Ford algorithm if no relaxations occur, though that might be in part due to poor implementation of that algorithm).

That said, in its current form, it adds about 30 lines of code, and the indirect mapping from index to representative year feels ungood. In the end, the question for ACM problems is ‘which could you code fastest with the lowest probability of error’, and looking at the two, I think the Bellman Ford would be a bit easier, specifically because it uses a series of edges, rather then a weight table. I could be wrong though, and the performance level does seem to be in favor of Floyd Warshall.

Side Rant
I seem to be rather poor at a very specific bit of contestifying. That is seeing points where you can discard unimportant information. In this problem, it was my attempt to treat every year as a vertex. In a problem about a disk defragmenter, I was tempted to treat the disk as a series of sectors rather then a series of filled and unfilled ranges (would have massively increased computation time to simulate, and probably over complicated the code; didn’t get to find out due to contest ending). I find that this seems like a common theme to me: I constantly attempt the simplest solution that could possibly work, and these problems are more often then not tuned to make that ‘the wrong way’. I suspect this comes from my many years of employment, where you’re not necessarily seeking the best solution, but rather just something that works well enough to get by. That is, of course, part of why I started this crazy blog: to get better at recognizing when I’m going down an incredibly wrong path.

Full Source Code for this solution.

ACM 2008 SCUSA Wormholes

South Central USA has to be one of the odder acronyms in common use… a single character away from SCUBA no less.

The last time I went to ACM, this problem was delegated to one of my team mates, and for one reason or another we never got it solved. I was reading through an Analysis of Algorithms book and came across a section on All Pairs Shortest Path and thought ‘This would apply perfectly and elegantly to this situation’. (I was wrong, but we’ll get to that soon). (Also, judge data can be had at this location, and I have been cheating miserably).

Synopsis
You work at Time Travel Inc. in the payroll department. Sick of keeping up with all the math related to time travel, you contracted yourself (while in college) to develop an application to determine how much a person has aged during their time travel, and whether or not their last journey was even physically possible. More specifically, time travel is acheived by portals which appear at specific moments in time, and exit at other specific moments, either aging or youthifying the human in the process, according to a specific formula; and ignoring the concept that you could die, or be negative years old.

Analysis
This is a graph traversal problem with negative weight directed edges. The goal is to find the minimum weight path that takes you from a start, to a destination, and back to a start; or equivelantly, two graph traversals.

As I said, I had read a bit on All Pairs Shortests Paths and was interested to see how it would handle this. Specifically, I was thinking ‘if you treat each year as a vertex, then each vertex has a weight one edge to the next year; and if a wormhole is present, an edge to that destination year. At this point in time, I thought the input was constricted to a few hundred years near the 2000’s (where else would you time travel to?). The actual range of years is from 1-10,000. 10,000 verticies, 2 10,000 x 10,000 matricies for storing path weights… no. Not gonna work. (As we’ll see in a bit, this is actually a bit of a lie).

Still curious to further my learning, I read up on the Bellman-Ford algorithm, which is a single source shortest paths algorithm. It works on the assumption that the absolute longest path will go through every single vertex in V, and therefore you can determine the minimum cost to get anywhere by going through each edge, relaxing the edge’s destination weight if appropriate, and then repeating that |V| times. Complexity of |V|*|E|. That actually sort of worked, in the sense that it produced an answer for most problems in under a minute. But, under a minute isn’t quite ideal.

To get reasonable response times, you can’t simulate every single meaningless year. You have to focus on only the meaningful ones, and declare edges between them. If you’re in 1950 and want to get to 2050, yes, you can look at it as 100 1 year edges, but it would be less pain to look at it as 1 100 year edge… or, if a wormhole happens to open up in 2012, a 62 year edge followed by a 38 year edge. The downside to ignoring meaningless years is this adds a fair bit of complex non intuitive code, just for the purpose of filling in the ‘time moves forward at a constant rate’ bits.

With that said, you could apply the same reasoning to all pairs shortest path, so long as you only kept track of the vertexes you actually cared about.

Source Code
As this one is a tad bit long, and far from being the greatest, I’m only going to post bits and peices of it.

public static void ProcessProblem(int id, TextReader input)
{
	Console.WriteLine("DATA SET #" + id);
	Timeline timeline = new Timeline();
	int wormHoleCount = ReadInt(input);
	for (int i = 0; i < wormHoleCount; i++)
	timeline.AddWormhole(ReadInts(input));
	int startingYear = ReadInt(input);
	int missionCount = ReadInt(input);
	for (int m = 0; m < missionCount; m++)
	{
		int destination = ReadInt(input);
		int distance = timeline.ShortestPath(startingYear, destination);
		int returnDistance = timeline.ShortestPath(destination, startingYear);
		if(distance==Int32.MaxValue || returnDistance==Int32.MaxValue)
			Console.WriteLine("IMPOSSIBLE");
		else
			Console.WriteLine(distance+returnDistance);
	}
}

Above is the primary problem solving bit. We create a timeline, register wormholes, then go through each mission and determine if the destination can be reached from the starting point, and if the starting point can be reached from the destination. Note that we’re using the max value of an int as a ‘no path’ return code.

class Timeline
{
	private List<int[]> wormholes=new List<int[]>();
 
	public void AddWormhole(int[] wormhole)
	{
		wormholes.Add(wormhole);
	}
 
	public int ShortestPath(int sourceYear, int destYear)
	{
		List<Edge> edges = new List<Edge>();
		List<int> vertexes=new List<int>();
		vertexes.Add(sourceYear); vertexes.Add(destYear);
		foreach (int[] wormhole in wormholes)
		{
			vertexes.Add(wormhole[0]);
			vertexes.Add(wormhole[1]);
			edges.Add(new Edge(wormhole[0], wormhole[1], true));
		}
		vertexes.Sort();
		for (int i = 1; i < vertexes.Count; i++)
			if (vertexes[i] == vertexes[i - 1])
				vertexes.RemoveAt(i--);
		for (int i = 1; i < vertexes.Count; i++)
			edges.Add(new Edge(vertexes[i - 1], vertexes[i], false));
 
		Dictionary<int, int> distances = new Dictionary<int, int>();
		distances[sourceYear] = 0;
 
		for (int i = 0; i < vertexes.Count; i++ )
		foreach (Edge edge in edges)
		{
			if (!distances.ContainsKey(edge.start)) continue;
 
			int distance = distances[edge.start] + edge.weight;
			if (!distances.ContainsKey(edge.end) || distances[edge.end] > distance)
				distances[edge.end] = distance;
		}
		if (distances.ContainsKey(destYear))
			return distances[destYear];
		return Int32.MaxValue;
	}
}

And here we have the meat of the problem. The graph traversal algorithm/timeline class. The first portion of the ShortestPath method is soley responsible for building a list of edges (note, third parameter to edge constructor is ‘isWormhole’). Once that is constructed, the Bellman Ford algorithm is repeatedly applied to get the distance from the start point to every other point. Bellman Ford performance can actually be improved slightly by not continuing to iterate if no relaxations were needed on a given iteration. On my machine, execution time went from about 3s to .5s. (Also, as a fun aside, note the sad lack of a .NET Set class).

That’s all for now. I think I’ll reimplement this with an all pairs shortest path just to see if it would meet time constraints on execution; and make a post about that later. Also, I’m planning to try and deploy/install Wordpress, with some form of auto syntax highlighting magic included, to make this blog thing a bit easier. We’ll see how that goes. And, for a third thought, I finally got started on LC3D, after mere months. I’m working on the GUI now, getting camera to work right, everything rendering, etc. Then I can start in on the fun that is integrating EVERYTHING. Yay.

Full Source Code for First Solution.

Euler 038- Find the largest 1..9 pandigital

Project Euler #038

A simple problem for today. A 1..9 Pandigital is defined as a nine digit number that contains digits 1 through 9. Find the largest pandigital that can be formed by concatenated multiplications. Specifically, given a number (192), multiply by, say, 1, 2, and 3 to get 192, 384, 576; the concatenated multiplication would be 192384576; which happens to be a pandigital. Multiplication factors have to be increasing, (1,2; 1,2,3; etc); and can’t be only 1.

So, my simple approach involves trying number of factors (n) of 2 through 8, and iterating through all possible factors that should be roughly less then a 9 digit number. Roughly/maybe/sort of. Overall the code isn’t that complex.

At least one optimization exists in realizing that the first digit of the factor has to be a 9, limiting the number of numbers to try; though in this case iterating over all possibilities is fast enough that that doesn’t really factor in.

public static void Main()
{
    //the only pangdigital with n=9 is 1
    //n>1 is required
    List<string> panDigitals = new List<string>();
    for (int n = 2; n < 9; n++)
        panDigitals.AddRange(FindPandigitals(n));
    panDigitals.Sort();
    foreach (string pan in panDigitals) Console.WriteLine(pan);
}
 
public static List<string> FindPandigitals(int n)
{
    List<string> results = new List<string>();
    int digitCount=(9/n)+1;
    int maxI = (int)Math.Pow(10, digitCount);
    for(int i=1; i<maxI; i++)
    {
        string s="";
        for (int a = 1; a <= n; a++)
            s += i * a;
        if (IsPandigital(s)) results.Add(s);
    }
    return results;
}
 
 
public static bool IsPandigital(string s)
{
    if (s.Length != 9) return false;
    for (char a = '1'; a <= '9'; a++)
        if (!s.Contains(a))
            return false;
    return true;
}

Side note: I suspect next I’m going to try an ACM problem… or maybe something from Google CodeJam… who knows?

Euler 208- Robot Walk

http://projecteuler.net/index.php?section=forum&id;=208 -

This is a fun one, which I approached in a way which can be classified as ‘entirely more painful then it had to be’. I should probably add a disclaimer to these posts “Not neccessarily a good way to solve the problem”; but alas, the solution is here irregaurdless of its suitability.

In this problem we have a robot; and we want to know, given that it will move 70 units, turning left or right each time (at 72 degrees, or one fifth a circular arc), how many unique paths will end back at the origin. I thought about it off and on for a few weeks. The immediate thing that jumps out are there are 2^70 distinct paths. Clearly immune to normal brute forcing.

First odd thoughts were that there might be something in the realm of combinatorics… I experimented with establishing if there was an obvious tie in between number of left/right turns and solutions; I couldn’t’ find one. I made the discovery that there should be equal number of end points facing all directions: (70 end points, so 70/5=14 points should face in each direction, 0 degrees, 72, 144, 216, 288, 360=0). But, I couldn’t figure out how to combine this knowledge with the fact that only two possible values can occur after a specific value (e.g. you can’t go from facing 0 to facing 216), while also maintaining that each end point needs to occur equally. Also, I’m sorry for the fictional people that read that last paragraph. I didn’t even understand it. All that said, there actually is an odd combination of combinatorics and dynamic programming which I didn’t find; I’ll chat about it after I give my own terrible solution.

So, we already discussed that we can’t brute force it. We can take a different approach though. While there are 2^70 distinct paths, there are considerably fewer unique ending points to paths. As an example, after 5 movements, there are two paths that end at the origin; after 10, there are at least 4 (and possibly more); similar occurs for several other end points.

So, my approach is to build a list of ending points (direction, and ending x and y coordinates, assuming start from origin), and the number of paths that end there. When we have that, we can combine any two paths by rotating the second path according to the first paths direction, and we know how many paths will end there: number of the first path * number of second path. The end result is we save alot of duplication of calculation. That said, this is by no means a fast solution.

We’ll start with the definition of a Location:

class Location
{
	public double ActualX;
	public double ActualY;
	public int CloseX;
	public int CloseY;
	public int Rotation;
	public Location(double x, double y, int rotation)
	{
		ActualX = x;
		ActualY = y;
		CloseX = (int)(100 * x);
		CloseY = (int)(100 * y);
		Rotation = rotation % 360;
		if (Rotation < 0)
		Rotation += 360;
	}
 
	public static Location operator +(Location left, Location right)
	{
		double angle=left.Rotation*Math.PI/180;
		double co=Math.Cos(angle);
		double si=Math.Sin(angle);
		double rotatedX = right.ActualX * co - right.ActualY * si;
		double rotatedY = right.ActualX * si + right.ActualY * co;
		return new Location(left.ActualX + rotatedX, left.ActualY + rotatedY, left.Rotation + right.Rotation);
	}
 
	public override bool Equals(object obj)
	{
		Location loc = (Location)obj;
		return loc.CloseX == CloseX && loc.CloseY == CloseY && loc.Rotation == Rotation;
	}
 
	public override int GetHashCode()
	{
		return ((Rotation/72) << 28) + (CloseX << 14) + CloseY;
	}
}

We see a few things about this. First, we’ve set it up to work in a Dictionary (as HashMaps are the most useful ADT ever created) by writing the GetHashCode and Equals methods; and we performed a minor ‘optimization’ of these, by using an integer approximation of the actual floating point location. This also solves a common problem in that checking equality on doubles is often spotty at best: 1.0000000001 != 1.0000000002. Basically, because of how we chose the CloseX and CloseY values, we assume that no two paths will get within 1/100th of a unit distance from each other, and not actually be at the same point. Experimentally, this results in the same number of distinct paths as using a 10,000 multiplier; so it probably works.

This can also sort of be verified mathamatically: because there are only 5 basic movements, the number of areas that can be visited is much decreased: you can’t visit every point on the plane using this robot. Or, somewhat closer to math terms: the movements that can be made do not form a Basis for the plane. (I’m fairly sure that’s a terrible butchering of Linear Algebra terminology; but it works for me). This is mostly because it can’t move at arbitrary factors of the directions it can move in: ie it can’t reverse, it can’t move pi steps in a given direction, and so on.

With those boring parts explained, we also have the plus operator. Given two Locations, it returns the Location that would be formed by following the path that lead to the first Location; then following the path that lead to the second Location given the new starting orientation and position. Basically, what it’s doing is rotating the second point around the origin, then adding that to the first point, and recalculating the rotation. Fairly simple. The rotation is stolen from Linear Algebra, and more specifically, I beleive I stole it from the wikipedia article on Rotation. (Sad fact: College Linear Algebra spent about 5 minutes on the topic, and never came up again).

With our Location defined, let’s look at the algorithms/code:

class E208
{
	private static Dictionary<int, Dictionary<Location, long>> paths;
	public static void Main()
	{
		paths = new Dictionary<int, Dictionary<Location, long>>();
		GenerateStartingPaths();
		GeneratePaths(70);
		Console.WriteLine("Count of 0's: "+paths[70][new Location(0, 0, 0)]);
		GeneratePaths(25);
		Console.WriteLine("Count of 0's: " + paths[25][new Location(0, 0, 0)]);
	}
 
	private static void GenerateStartingPaths()
	{
		paths[1] = new Dictionary<Location, long>();
		double x = Math.Cos(72 * Math.PI / 180);
		double y = Math.Sin(72 * Math.PI / 180);
		paths[1][new Location(x, y, 72)] = 1;
		paths[1][new Location(x, -y, 4*72)] = 1;
		Combine(1, 1); //generate length 2 paths
	}
 
	private static void GeneratePaths(int length)
	{
		while(!paths.ContainsKey(length))
		{
			int maxKey = 1;
			foreach (int key in paths.Keys) if (key < length) maxKey = Math.Max(maxKey, key);
			int otherKey=Math.Min(length-maxKey,2);
			Combine(maxKey, otherKey);
		}
	}
 
	private static void Combine(int firstKey, int secondKey)
	{
		Dictionary<Location, long> first = paths[firstKey];
		Dictionary<Location, long> second = paths[secondKey];
		Console.WriteLine("Combining " + firstKey + " " + secondKey+" with "+first.Count+" and "+second.Count+" paths");
		Dictionary<Location, long> result = new Dictionary<Location, long>(2*first.Count);
		foreach (KeyValuePair<Location, long> firstKvp in first)
		foreach (KeyValuePair<Location, long> secondKvp in second)
		{
			Location newPoint = firstKvp.Key + secondKvp.Key;
			long count = firstKvp.Value * secondKvp.Value;
			long currentValue;
			if (result.TryGetValue(newPoint, out currentValue))
				result[newPoint] = currentValue + count;
			else result[newPoint] = count;
		}
		paths[firstKey + secondKey] = result;
	}
}

So, we have a few large things here. First note that we keep track of all paths of a given length that were created. So, in that paths Dictionary, we have a list of all paths of length 1, length 2, length 4 … length 70. With that out of the way, let’s get started.

First thing we’ll chat about is combining. Combining takes two sets of paths of given lengths, and then finds all combinations of those two sets as a result set. This gives you all paths of length firstLength + secondLength. It does this by going through each path in the first set, and adding to it each path in the second set. This gives us a new location, and we know that the number of ways to get to that new location are number of ways in first set * number of ways in second set.

Not the cleanest of explanations I understand. A better way to think about it might be to say ‘if I wanted to get to Location X + Y, I could take any path from this first set of paths that take me to X, then any path in the second set of paths that take me to Y. I can thus take number of first paths * number of second paths unique routes.’ Basically.

Moving on from my terrible explanation attempts, we’ll look at GeneratePaths. This has an interesting story to it. When I first wrote this, I would go through, and select the largest two paths that would sum to near the desired length. It made sense to me: take this route, you perform only log n combine operations, and everyone ends up happy. With this approach, calculating paths for length 70 took ~280 seconds. And I spent a while trying to improve this to no avail. As it turns out, this would happen because combining a 32 length path with a 32 length path requires 18,000*18,000 = ~350,000,000 sets of operations (a set of operations being dictionary lookups, cos/sin, memmory allocations, math). What I found was, that by doing only paths of length 2, we decrease the number of operations needed to get to path length 70 by about a factor of 100. More specifically, path length 68 had 324,000 unique paths; if we assume all paths (except length 2, having 4 elements) had 324,000 unique elements, and we needed perform 68/2 Combinations, we would need to perform only 324,000*68/2*4 = 44,000,000 operations. It’s actually a fair bit less, I’d say on the order of 20,000,000. This leads to a total calculation time of only about 8 seconds. Very nice.

As to what it’s actually doing, it goes through, finds the largest key smaller then the length we’re looking for, then tries to combine it with the paths of length 2, or paths of length 1 if it happens that the desired length and max key are only different by a length of 1.

GenerateStartingPaths has the first two paths hard coded, and then goes on to generate paths of length 2, to make the code for generating arbitrary length paths a bit nicer.

Main is fairly obvious: Generates the starting paths, and calls for paths of length 70, then looks up the value of Location 0,0; facing North. Nice and simple. Sort of. This post makes me question the economics of >100 line solutions in a blog format. But, it was fun.

Now, for that nice combinatoric/DP solution.

static Dictionary<long, long> values=new Dictionary<long,long>();
 
public static void Main()
{
	Console.WriteLine(Calculate(5, 5, 5, 5, 5));
	Console.WriteLine(Calculate(14, 14, 14, 14, 14));
}
 
static long Calculate(long n1, long n2, long n3, long n4, long n5)
{
	if (n1 < 0 || n2 < 0 || n3 < 0 || n4 < 0 || n5 < 0) return 0;
	if ((n1 + n2 + n3 + n4 + n5) == 0) return 1;
	long key = n1 + 15 * (n2 + 15 * (n3 + 15 * (n4 + 15 * n5)));
	if (values.ContainsKey(key))
		return values[key];
	long result = Calculate(n2, n3, n4, n5, n1-1) +
	Calculate(n5-1, n1, n2, n3, n4);
	values[key] = result;
	return result;
}

It executes in under a second (and is based off code by ‘hirak99′ in the Project Euler forums). I almost/barely understand what it is doing. First, consider each movement as a move in x,y; rather then having a rotation component, and other complexities. If you look at it that way, then you need equal number of moves in each of 5 directions to get back to the origin. But, you can only move in one of two directions when at a specific location; this algorithm accounts for that by keeping track of how many moves we’ve made, and rotating them arround to account for ‘I’m turning left’ or ‘I’m turning right’.

As a slightly better (or worse) example, envision you start at the origin, turn left, then turn right. You were at 0,0; you moves to a point at cos(-72),sin(72), then move again to in that same basic direction (actually you move at cos(0),sin(0) rotated 72 degrees). You end up at the point 2x,2y, as the algorithm dictates. Which was the hardest bit for me to understand about it. And I still couldn’t explain this to (non-tech/math relative)… but I found it cool none-the-less.

Euler 078- Find First Number With Count of Unique Sums Divisible by 1,000,000

http://projecteuler.net/index.php?section=problems&id;=78

I actually solved this better then a week ago, but never got arround to the writeup. I blame Jae, and my inability to schedule time.

Technically the topic description is a lie/simplification. The problem is actually this: given p(n) defines the number of unique ways to form groups from n items (ie for 3 items you can have 3 in 1 group, 2 in 1 and 1 in 1, or all 3 in seperate groups); find the value of n for which p(n) % 1000000 = 0. This is basically the same concept from Euler 76, if I recall it correctly, where you’re looking for the number of unique sums for a given integer n: this is the exact same problem in that sense.

However, the problem itself is entirely different, and much more painful.

The first item that I recall comming across is that p(n) doesn’t fit into a 32 bit integer for the value that we’re looking for; and it doesn’t even fit in a 64 bit integer (though I don’t recall if I had figured that out the fun and exciting way before the next bit). To get arround this I decided to play arround a bit with ruby, and it’s arbitrary precision integer fun. As it turns out, Ruby is dog slow once you get into these 20+ digit simple additions; and that also got into the next issues.

Memory usage is massive for the 2D array approach to handling this. I vaguely recall quiting the application after it had gotten to 600~mb of memory usage and in either the low 20,000’s or the low 2,000’s of n. Getting a solution to this was actually relatively simple, and slighlty stolen from the discussion to the solution of problem 76. There exists a recurrence relation that defines p(n) in terms of p(~1)…p(n-1): (loosely stolen from http://mathworld.wolfram.com/PartitionFunctionP.html)

P(n)=sum(1..n on k, (-1)^(k+1) * [P(n- (3k^2-k)/2 + P(n- (3k^2+k)/2]
with P(0)=1, and P(<0)=0.

This speeds up calculation a fair bit, and cuts memory usage by… alot. (should be by a factor of n / it should be a sqrt of the original usage).

That said, calculation still took quite a bit of time, (mostly due to simple 20+ digit arthmetic, again), so I wondered if there was a better solution. And thus stole from the Project Euler forums yet agin. The key to the next step is that we don’t actually care what the value of p(n) is, just that p(n)%1000000=0 . This means that the array we use for storing p(n) only needs to concern itself with values between 0 and 999,999; meaning we can go back to C# and it’s 32bit primitive data types.

One fun thing that arose from only keeping track of a part of p(n) is that the strict / straightforward calculation of p(n) is no longer gauranteed to be positive; and I spent a fun few times trying to figure out why it would fail on me in interesting ways. But, eventually it worked… eventually.

Calculation after all these changes was a fair bit faster; and if I recall correctly, the answer was in the 50,000 range of n; a fair bit sooner then I would have expected.

Code:

public static void Main()
{
    for(int i=5;;i++)
    {
        if (NumberOfUniqueSums(i) == 0)
            Console.Write(i + "-*-*-" + NumberOfUniqueSums(i)+" ");
        //something to pass the time
        if (i % 1000 == 0)
            Console.Write(i + " ");
    }
}
 
//storage for previous lookups of P(n)
static Dictionary<int, int> sumResults = new Dictionary<int, int>();
static int NumberOfUniqueSums(int numToSum)
{
    if(numToSum < 0) return 0;
    if(numToSum==0) return 1;
    if(sumResults.ContainsKey(numToSum))
        return sumResults[numToSum];
 
    int sum=0;
    //past the sqrt, you're always looking at P(-x); always 0
    for(int k=1; k<=Math.Sqrt(numToSum)+1 && k<=numToSum; k++)
    {
        int n1 = numToSum - (k * (3 * k - 1) / 2);
        int n2 = numToSum - (k * (3 * k + 1) / 2);
        int tempSum = NumberOfUniqueSums(n1) + NumberOfUniqueSums(n2);
        int factor = (k % 2 == 1) ? 1 : -1;
        sum += factor * tempSum;
        sum %= 1000000;
    }
    sumResults[numToSum]=sum;
    return sum;
}

Euler 076- Number of Ways to write 100 as a sum

From http://projecteuler.net/index.php?section=problems&id;=76 -

Find the number of ways to write 100 as a unique sum of at least two (positive) numbers. This is actually rather similar to Euler 077, in that the same basic approach can be applied… except, brute force won’t work, directly. Going through and doing a blind recursive determination of all possible sum leads down the path of ruin, and waiting until the apocolypse. Or at least I assume so; I didn’t wait arround to find out just how slow it was.

A *slightly* better way to solve this is using a dynamic programming approach. Brute forcing in this case is problematic because it retreads and re-solves alot of subproblems. Say we wanted to solve for 5. To solve for 5 using our current algorithm, we hit ‘how many ways can you sum 1 using 4 through 1′ several times. (Haven’t actually counted it out, but take my word for it). When we try it with 100, we spend alot more time at that last level: the answer is on the order of 2*10^9; that would be about 10^9/100=10^7 times we retread the ‘how many ways to sum 1 using x through 1′ problem.

Put another way: figure out the answer to ‘how many ways is there to sum 1 using x through 1′, and use that, rather then recomputing. Apply similar logic for 2, 3, 4, 5, etc. In the end, you only do arround 100^2=10,000 computations, instead of 100 million. This is the concept of dynamic programming: solve a subproblem, store the result for use in calculating the greater problem.

Or, at least that’s my understanding of it; this may change when I finally take an algorithms class.

So. Code:

public static void Main()
{
    uniqueSumResults = new int[101, 100];
    for (int i = 0; i < uniqueSumResults.GetLength(0); i++)
        for (int j = 0; j < uniqueSumResults.GetLength(1); j++)
            uniqueSumResults[i, j] = -1;
    Console.WriteLine(NumberOfUniqueSums(5, 4));
    Console.WriteLine(NumberOfUniqueSums(100, 99));
}
 
static int[,] uniqueSumResults;
static int NumberOfUniqueSums(int numToSum, int maxSummand)
{
    if(maxSummand <= 0) return 0;
    if(numToSum < 0) return 0;
    if(numToSum == 0) return 1;
    if(uniqueSumResults[numToSum, maxSummand] != -1)
        return uniqueSumResults[numToSum, maxSummand];
 
 
    int sum = NumberOfUniqueSums(numToSum - maxSummand, maxSummand);
    sum += NumberOfUniqueSums(numToSum, maxSummand - 1);
    uniqueSumResults[numToSum, maxSummand] = sum;
    return sum;
}

Slight explanation: we use the ‘uniqueSumResults’ 2D array as a cache of results. If the value in the cache for a given set of parameters isn’t the initialized -1, then it contains the answer for the inputs; no further calculation necessary.

Euler 077- First value that can be written as sum of more then x primes

From Euler 077-

The goal is to find the first number that can be written as a sum of primes in 5,000 different ways. As an example, 10 can be written as 7+3, 2+2+2+2+2, 5+5, 5+3+2, etc.

Going about this the slow way. The first and immediate thoughts is ‘gee, I can try brute force’. The maximum number that the answer can be is 10,000 (2+2+…+2=2*5000). It’s actually a fair ammount less (2+2+2=3+3, but, my math skills are too terrible to figure out a better upper bound).

Now, I envision that determining all unique sums is quite slow for a single number. Can’t prove but envision. And slow being relative. To speed this up, we use an interesting property of these numbers.

Assume P(k) is the number of unique sums for k; P(k) <= P(k+2), as k+2 can be given simply by 2+(all solutions of k). Similar for P(k+3). I can’t prove it for P(k+1) though, sadly. This means that we could do one of two things: Either do a binary search on all even numbers, then all odd numbers to find the first number with P(k)==5000, (should be about 2*log2(10000)=26 checks, roughly); or, we can do a binary search down to a resolution of about 3, then check those final 3 numbers. It’d be about 16-19 comparisons. As the binary search across all evens then all odds is much simpler/less sticky, we’ll go that route.

Now, humerous fact: I went through all this thinking, and then did a test to see a. how slow finding unique sums is, and b. verify my ‘find unique sums’ algorithm works. The answer is less then 100, and can be found without any of the cool binary searching. I got to do precisely none of the cool things I was planning on. None of them. I feel like life is such a terrible waste, as I probably spent hours working out ways to solve the problem which would be equivelant to ‘find the first k such that P(k) is 10^(something huge)’… very dissappointing. And also, P(k)<=P(k+1) for all observed points.

Edit: Slight explanation of how to find the number of unique sums.
What we do is attempt all summations using primes in decreasing value of the prime. This ensures that any summation found will be unique. We do this recursively, and it can actually be considered dynamic programming; but, alas, the answer is already found.

Code:

static List<int> primes;
public static void Main()
{
    BuildPrimes(10000);
    for (int i = 2; i <= 100; i++)
        Console.WriteLine(i+" "+NumberOfUniqueSums(i, 0));
}
static int NumberOfUniqueSums(int k, int primeStartIndex)
{
    if(k<0) return 0;
    if (k == 0) return 1;
    int sum = 0;
    for (int primeIndex = primeStartIndex; primeIndex < primes.Count; primeIndex++)
        sum += NumberOfUniqueSums(k - primes[primeIndex], primeIndex);
    return sum;
}
static void BuildPrimes(int maxSquare)
{
    primes = new List<int> { 2};
    int maxI=((int)Math.Sqrt(maxSquare))+1;
    for (int primeCandidate = 3; primeCandidate < maxI; primeCandidate += 2)
    {
        bool isPrime = true;
        for (int i = 0; i < primes.Count && isPrime; i++)
            if (primeCandidate % primes[i] == 0)
                isPrime = false;
        if (isPrime)
            primes.Add(primeCandidate);
    }
    //max prime is now at index 0
    primes.Reverse();
}

Euler 206- Concealed Square

Find the unique positive integer whose square has the form 1_2_3_4_5_6_7_8_9_0,
where each “_” is a single digit.

First thought was brute force by replacing blanks with numbers, and checking for a square root. This comes to a mere 10^9 possibilities, with a lot of parsing longs from string. After a bit of analysis though, you realize the last digit of the number we are looking for has to be 0. (0 being the only a*a to result in a 0 digit at the end). This also means the first blank is a 0, giving a mere 10^8 possibilities. Next is to realize that 9 can only be formed by a 3, or a 7 in the number. (Try not to forget the 7, as I did the first time through). So, we now know the target square ends in either 70 or 30.

We can move away a lot of parsing longs from strings, and sqrt’s, by just iterating over the possible answers, rather then the possible answers^2. This basically means iterating from 10^9+30 to something less then 10^10 (sqrt of 1929394959697989900; though I’m far too boring to plug that into a calculator). As already stated, we’re looking for something ending in 30 or 70, so we could start with 10^9+30 and increment by 100, then switch to +70, or alternatively alter increments of 40 and 60.

Probably the most complex thing is checking whether or not a square is of the desired format. For that I get the string representation of the square, and then in a for loop check that every other character is incrementing; with a special check for the 0 case. I envision there are better ways, but this was good enough for this problem.

Code:

static void Main(string[] args)
{
    long start = 1000000000;
    //add 100 to long on the off chance that there's a terrible loss
    //  of precision in the constant double
    long end = (long)(Math.Sqrt(1929394959697989900))+100;
    Test(start + 30, end, 100);
    Test(start + 70, end, 100);
}
static void Test(long start, long end, long incr)
{
    for (long current = start; current < end; current += incr)
    {
        string squareString = (current * current).ToString();
        bool isValid = true;
        for (int i = 1; i <= 9 && isValid; i++)
            if (squareString[2 * (i - 1)] != '0' + i)
                isValid = false;
        if (squareString[2 * (10 - 1)] != '0')
            isValid = false;
        if (isValid)
            Console.WriteLine("Solution: " + current);
    }
}

Euler 001- Starting out Easy

At least something to start the blog with.

Project Euler problem #1; sum all natural numbers less then 1000 that are multiples of 3 or 5. Natural numbers meaning positive integers. The simplest way to go about this is a for loop with a sum, something akin to:

int sum=0;
for(int i=1; i<1000; i++)  
    if(i%3==0 || i%5==0)
        sum+=i;
Console.WriteLine(sum);

Which gives us a nice and specific answer, which is correct. Fun optimization side of this:

You can consider the sum of all 3’s to be 3*sum(1..333), which can be computed as 3*(333*332/2 + 333), if math holds.

Similarly, sum of 5 is 5*sum(1..199) = 5*(199*198/2 + 199).

Next, you need to determine the sum of natural numbers divisible by 15: 15*sum(1..66)=15*67*66/2;

Finally, take the sum of the first two values, and subtract from that the last value. In this case: 166833 + 99500 – 33165 = the answer. This follows from a little too much math, and a bit of set theory: numbers divisible by 15 are counted in both the 3 and 5 cases, effectively double counted. We subtract one of these double counts to end up with a single count of all numbers; and, an O(1) solution.