Beemer (dr_tectonic) wrote,

R Advent: Days 1-5

Colin posted about Advent of Code the other day, and the first couple of problems were enticing, so I got sucked into doing the whole thing. Well, last night I finished!

It was an enjoyable challenge, and I thought it would be fun to post my solutions here and discuss them a little bit. I did everything in R because that's what I want to get better at, and I tried to write my code as R-tastically as possible. That means two things, mostly: first, aiming for vectorized/functional code over looping/procedural code, and second, using community-developed libraries rather than reinventing the wheel whenever possible. I spend a lot of time googling "how do you do X in R?" anyway (partly because I'm still learning R idioms and partly because inconsistent naming is one of R's biggest weaknesses), and I figure that in the real world, the goal is never to prove that you can do it, but to solve the problem. So if there's a library out there that was written by an expert and will do what you want, use it.

One of the things I like about R is how concise you can be. In other languages, I often feel like I'm spending most of my time and effort keeping a bunch of arrays and counters and indices in sync as I shepherd all the data through the analysis procedure. Whereas in R, it feels more like I put all my effort into doing origami to fold my data up into the correct configuration, and then I can write three lines of code to make magic happen. I'm including how many lines of code (LOC) each solution took; A/B means part 1 took A lines and part 2 took B extra.

Days 1-5

Day 1 [LOC: 5/2]
A nice easy start: counting a bajillion left and right parentheses. Vectorization gives you such an immense boost in clarity and concision on things like this that I kind of never want to program without it anymore.
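
For illustration, here is a minimal sketch of the vectorized approach (not the original solution). It assumes that "(" counts as +1 and ")" as -1, that part 1 wants the final total, and that part 2 wants the first position where the running total hits -1. The input string is a made-up stand-in for the real puzzle input.

```r
# Hypothetical short input; the real one is thousands of characters long.
input <- "()())"

# Split into single characters and map each one to +1 or -1 in one shot.
steps <- ifelse(strsplit(input, "")[[1]] == "(", 1L, -1L)

# Part 1 (assumed): net count of parentheses.
final_floor <- sum(steps)                          # -1

# Part 2 (assumed): first position where the running total reaches -1.
first_basement <- which(cumsum(steps) == -1L)[1]   # 5
```

No loop, no counter variable: the whole string is handled as one vector.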

Day 2 [LOC: 8/5]
Just throwing some algebra at a thousand tuples. Learning to use the apply() family of functions is really key to getting R to jump through hoops.
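
A sketch of what that can look like with apply(), under some assumptions: each input line is a tuple of three dimensions written like "2x3x4", and the formula is the box's surface area plus the area of its smallest side. The input values and variable names here are illustrative, not taken from the original solution.

```r
# Hypothetical input: one "LxWxH" string per box.
input <- c("2x3x4", "1x1x10")

# Fold the data into a matrix with one row per box.
dims <- do.call(rbind, lapply(strsplit(input, "x"), as.integer))

# Apply the same bit of algebra to every row (MARGIN = 1 means "by row").
paper <- apply(dims, 1, function(d) {
  sides <- c(d[1] * d[2], d[1] * d[3], d[2] * d[3])
  2 * sum(sides) + min(sides)   # assumed formula: surface area + smallest side
})

total <- sum(paper)   # 58 + 43 = 101
```

Most of the effort goes into getting the data into matrix shape; after that, the calculation itself is a few lines.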

Day 3 [LOC: 11/6]
The existence of functions like cumsum and unique in the core of the language is huge for data analysis.
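
As a sketch of why those two functions matter here, assume the input is a string of moves ("^", "v", "<", ">") on a grid and the question is how many distinct positions get visited, starting point included. The input and names below are made up for the example.

```r
# Hypothetical input: four moves that trace out a square.
input <- "^>v<"
moves <- strsplit(input, "")[[1]]

# Look up the x and y step for every move at once via named vectors.
dx <- unname(c(">" = 1L, "<" = -1L, "^" = 0L, "v" = 0L)[moves])
dy <- unname(c("^" = 1L, "v" = -1L, ">" = 0L, "<" = 0L)[moves])

# cumsum turns steps into positions; prepend the starting point (0, 0).
path <- cbind(x = c(0L, cumsum(dx)), y = c(0L, cumsum(dy)))

# unique() on a matrix drops duplicate rows, i.e. repeat visits.
houses <- nrow(unique(path))   # 4
```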

Day 4 [LOC: 11/8]
This is the first one I had to use some procedural flow control for. It's hard to do stateless functional stuff when you have no upper bound on the calculation. (My first attempts involved shoving larger and larger arrays through the MD5 function until I found a solution -- or ran out of memory. Once I gave up and switched to using a proper while loop, it got much cleaner.)
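
The while-loop version might look roughly like the sketch below. It assumes the task is to find the smallest positive integer that, appended to a key, produces a hash starting with a given prefix. The function name first_match and the hash_fn parameter are hypothetical; the commented-out MD5 wrapper assumes the digest package is installed.

```r
# Count upward from 1 until the hash of key + number starts with `prefix`.
# `hash_fn` is any function that maps a string to a hex digest string.
first_match <- function(key, prefix, hash_fn) {
  n <- 1L  # integer, so paste0() never renders it as "1e+05"
  while (!startsWith(hash_fn(paste0(key, n)), prefix)) {
    n <- n + 1L
  }
  n
}

# One way to plug in MD5, assuming the digest package is available:
# md5 <- function(s) digest::digest(s, algo = "md5", serialize = FALSE)
# first_match("some-key", "00000", md5)
```

Because the loop stops at the first hit, memory use stays constant no matter how far it has to search.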

Day 5 [LOC: 12/14]
I suspect most people's solutions to this involve a lot of traversing a pointer along an array, checking each point and incrementing counters inside a loop. So long as it fits in memory, I think it's much nicer to just expand everything out and check all the possibilities against one another using something like the %in% operator, because then it's just one step. And that's good, because the less code you write, the fewer opportunities there are for bugs.
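
Here is a sketch of the expand-and-compare idea, assuming three rules for a "nice" string: at least three vowels, at least one doubled letter, and none of the pairs "ab", "cd", "pq", or "xy". The function name is_nice and the test strings are illustrative, not the original code.

```r
is_nice <- function(s) {
  ch <- strsplit(s, "")[[1]]
  # Expand the string into every adjacent two-letter pair.
  pairs <- paste0(head(ch, -1), tail(ch, -1))

  sum(ch %in% c("a", "e", "i", "o", "u")) >= 3 &&   # three or more vowels
    any(pairs %in% paste0(letters, letters)) &&     # some doubled letter
    !any(pairs %in% c("ab", "cd", "pq", "xy"))      # no forbidden pair
}

strings <- c("ugknbfddgicrmopn", "jchzalrnumimnmhp", "haegwjzuvuyypxyu")
nice <- vapply(strings, is_nice, logical(1))   # TRUE, FALSE, FALSE
sum(nice)                                      # 1
```

Each rule is a single %in% comparison against a precomputed set, so there is no index to walk along the string and nothing to keep in sync.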

