Category: Programming

  • Swift Dependency Management Adventures

    I’m in the process of (somewhat belatedly) upgrading the RadioSpiral app to work properly with Azuracast.

    The Apple-recommended way of accessing the stream metadata just does not work with Azuracast’s Icecast server – the stream works fine, but the metadata never updates, so the app streams the music but never updates the UI with anything.

    Because it could still stream (heh, StillStream) the music, we decided to go ahead and deploy. There were so many other things that Azuracast fixed for us that there was no question that decreasing the toil for everyone (especially our admin!) was going to make a huge difference.

    Addressing the problem

    Azuracast supplies an excellent now-playing API in four different flavors:

    • A file on the server that has now-playing data, accessible by simply getting the contents of the URL. This is only updated every 30 seconds or so, which isn’t really good enough resolution, and requires the endpoint be polled.
    • An API that returns the now-playing data as of the time of the request via a plain old GET to the endpoint. This is better but still requires polling to stay up to date, and will still not necessarily catch a track change unless the app polls aggressively, which doesn’t scale well.
    • Real-time push updates, either via SSE over https or websocket connection. The push updates are less load on the server, as we don’t have to go through session establishment every time; we can just use the open connection and write to it. Bonus, the pushes can happen at the time the events occur on the server, so updates are sent exactly when the track change occurs.

    I decided that the websocket API was a little easier to implement. With a little help from ChatGPT to get me an initial chunk of code (and a fair amount of struggling to figure out the proper parameters to send for the connection request),

    I used a super low-rent SwiftUI app to wrap AVAudioSession and start up a websocket client separately to manage the metadata; that basically worked and let me verify that the code to monitor the websocket was working.

    I was able to copy that code inside of FRadioPlayer, the engine that RadioSpiral uses to do the streaming, but then I started running into complications.

    Xcode, Xcode, whatcha gonna do?

    I didn’t want to create an incompatible fork of FRadioPlayer, and I felt that the code was special-purpose enough that it wasn’t a reasonable PR to make. In addition, it was the holidays, and I didn’t want to force folks to have to work just because I was.

    So I decided to go a step further and create a whole new version of the FRadioPlayer library, ACRadioPlayer, that would be specifically designed to be used only with Azuracast stations.

    Initially, this went pretty well. The rename took a little extra effort to get all the FRadio references switched over to ACRadio ones, but it was fairly easy to get to a version of the library that worked just like FRadioPlayer, but renamed.

    Then my troubles began

    I decided that I was going to just include the code directly in ACRadioPlayer and then switch RadioSpiral to the new engine, so I did that, and then started trying to integrate the new code into ACRadioPlayer. Xcode started getting weird. I kept trying to go forward a bit at a time — add the library, start trying to include it into the app, get the fetch working…and every time, I’d get to a certain point (one sample app working, or two) and then I’d start getting strange errors: the class definition I had right there would no longer be found. The build process suddenly couldn’t write to the DerivedData directory anymore. I’d git reset back one commit, another, until I’d undone everything. Sometimes that didn’t work, and I had to throw away the checkout and start over. The capper was “Unexpected error”, with absolutely nothing to go on to fix it.

    Backing off and trying a different path

    So I backed all the way out, and started trying to build up step-by-step. I decided to try building the streaming part of the code as a separate library to be integrated with ACRadioPlayer, so I created a new project, ACWebSocketClient, and pulled the code in. I could easily get that to build, no surprise, it had been building, and I could get the tests of the JSON parse to pass, but when I tried to integrate it into ACRadioPlayer using Swift Package Manager, I was back to the weird errors again. I tried for most of a day to sort that out, and had zero success.

    The next day, I decided that maybe I should follow Fatih’s example for FRadioPlayer and use Cocoapods to handle it. This went much better.

    Because of the way Cocoapods is put together, just building the project skeleton actually gave me some place to put a test app, which was much better, and gave me a stepping stone along the way to building out the library. I added the code, and the process of building the demo showed me that I needed to do a few things: be more explicit about what was public and what was private, and be a little more thoughtful about the public class names.

    A couple hours work got me a working demo app that could connect to the Azuracast test station and monitor the metadata in real time. I elected to just show the URL for the artwork as text because actually fetching the image wasn’t a key part of the API.

    I then hit the problem that the demo app was iOS only. I could run it on MacOS in emulation mode, but I didn’t have a fully-fledged Mac app to test with. (Nor did I have a tvOS one.) I tried a couple variations on adding a new target to build the Mac app, but mostly I ended up breaking the work I had working, so I eventually abandoned that.

    I then started working step by step to include the library in ACRadioPlayer. FRadioPlayer came with an iOS apps (UIKit and SwiftUI), a native Mac app, and a tvOS app. I carefully worked through getting the required versions of the OS to match in the ACWebSocketClient podspec, the ACRadioPlayer Podfile, and the ACRadioPlayer Xcode project. That was tedious but eventually successful.

    Current status

    I’ve now got the code properly pulled in, compatible with the apps, and visible to each of the apps. I’ll now need to pull in the actual code that uses it from the broken repo (the code was fine, it was just the support structures around it that weren’t) and get all the apps working. At that point I can get both of the libraries out on Cocoapods, and then start integrating with RadioSpiral.

    In general, this has been similar to a lot of projects I’ve worked on in languages complex enough to need an IDE (Java, Scala, and now Swift): the infrastructure involved in just getting the code to build was far more trouble to work with and maintain, and consumed far more time, than writing the code itself.

    Writing code in Perl or Python was perhaps less flashy, but it was a lot simpler: you wrote the code, and ran it, and it ran or it didn’t, and if it didn’t, you ran it under the debugger (or used the tests, or worse case, added print statements) and fixed it. You didn’t have to worry about whether the package management system was working, or if something in the mysterious infrastructure underlying the applications was misconfigured or broken. Either you’d installed it, and told your code to include it, or you hadn’t. Even Go was a bit of a problem in this way; you had to be very careful in how you got all the code in place and that you had gotten it in place.

    Overall, though, I”m pretty happy with Cocoapods and the support it has built in. Because FRadioPlayer was built using Cocoapods as its package management, I’m hoping that the process of integrating it into RadioSpiral won’t be too tough.

    [From the future: it was, and I ended up abandoning that too.]

  • Re-upping WebWebXNG

    So it’s been a minute since I did any serious work on WebWebXNG.

    Initially, I decided that the easiest way forward was “translate this old CGI code into modern web code”. And up to a point, that was a good way to go. But I got to the point where I was trying to make the rubber meet the road, and the intertwining of templating and code in the old version was making me stall out.

    I’ve had a breather, working on other projects, and the world has moved on and brought me some new things. One in particular is htmx.

    The htmx library works a whole lot more like the old CGI world did, just better. Everything is capable of interacting with the user, all of the HTTP verbs are available, and interaction is by exchanging chunks of HTML. You don’t convert to JSON, then convert back to HTML. This kind of logic definitely fits better with the concept of WebWebX as-it-was.

    Also, Perl project management has definitely changed — and improved. I did like Dist::Zilla, but it’s definitely a heavyweight solution. In the meantime, Minilla has appeared, and it fits incredibly well into the model I want to use to manage the code:

    • Your module is Pure Perl, and files are stored in lib.
    • Your executable file is in script directory, if there is one.
    • Your dist sharedirs are in share, if you have any.
    • Your module is maintained with Git and git ls-files matches with what you will release.
    • Your module has a static list of prerequisites that can be described in a cpanfile.
    • Your module has a Changes file.
    • You want to install via cpanm.

    I do have a working page storage engine, which is good, but the interaction engine is definitely nowhere. I’m coming back to the project with fresh eyes, and I’m going to redesign it top-to-bottom to use htmx for all the backend interaction.

    Looking forward to this, and the next iteration of WebWebXNG starts now.

  • JSON, Codable, and an illustration of ChatGPT’s shortcomings

    A little context: I’m updating the RadioSpiral app to use the (very nice) Radio Station Pro API that gives me access to useful stuff like the station calendar, the current show, etc. Like any modern API, it returns its data in JSON, so to use this in Swift, I need to write the appropriate Codable structs for it — this essentially means that the datatypes are datatypes that Swift either can natively decode, or that they’re Codable structs.

    I spent some time trying to get the structs right (the API delivers something that makes this rough, see below), and after a few tries that weren’t working, I said, “this is dumb, stupid rote work – obviously a job for ChatGPT.”

    So I told it “I have some JSON, and I need the Codable Swift structs to parse it.” The first pass was pretty good; it gave me the structs it thought were right and some code to parse with – and it didn’t work. The structs looked like they matched: the fields were all there, and the types were right, but the parse just failed.

    keyNotFound(CodingKeys(stringValue: "currentShow", intValue: nil), Swift.DecodingError.Context(codingPath: [CodingKeys(stringValue: "broadcast", intValue: nil)], debugDescription: "No value associated with key CodingKeys(stringValue: \"currentShow\", intValue: nil) (\"currentShow\").", underlyingError: nil))

    Just so you can be on the same page, here’s how that JSON looks, at least the start of it:

    {
    	"broadcast": {
    		"current_show": {
    			"ID": 30961,
    			"day": "Wednesday",
    			"date": "2023-12-27",
    			"start": "10:00",
    			"end": "12:00",
    			"encore": false,
    			"split": false,
    			"override": false,
    			"id": "11DuWtTE",
    			"show": {...

    I finally figured out that Swift, unlike Go, must have field names that exactly match the keys in the incoming JSON. So if the JSON looks like {broadcast: {current_show... then the struct modeling the contents of the broadcast field had better have a field named current_show, exactly matching the JSON. (Go’s JSON parser uses annotations to map the fields to struct names, so having a field named currentShow is fine, as long as the annotation says its value comes from current_show. That would look something like this:

    type Broadcast struct {
        currentShow  CurrentShow `json:currentShow`
        ...
    }
    
    type CurrentShow struct {
       ... 

    There’s no ambiguity or translation needed, because the code explicitly tells you what field in the struct maps to what field in the JSON. (I suppose you could completely rename everything to arbitrary unrelated names in a Go JSON parse, but from a software engineering POV, that’s just asking for trouble.)

    Fascinatingly, ChatGPT sort of knows what’s wrong, but it can’t use that information to fix the mistake! “I apologize for the oversight. It seems that the actual key in your JSON is “current_show” instead of “currentShow”. Let me provide you with the corrected Swift code:”. It then provides the exact same wrong code again!

    struct Broadcast: Codable {
        let currentShow: BroadcastShow
        let nextShow: BroadcastShow
        let currentPlaylist: Bool
        let nowPlaying: NowPlaying
        let instance: Int
    }

    The right code is

    struct Broadcast: Codable {
        let current_show: BroadcastShow // exact match to the field name
        let next_show: BroadcastShow.   // and so on...
        let current_playlist: Bool
        let now_playing: NowPlaying
        let instance: Int
    }

    When I went through manually and changed all the camel-case names to snake-case, it parsed just fine. (I suppose I could have just asked ChatGPT to make that correction, but after it gets something wrong that it “should” get right, I tend to make the changes myself to be sure I understood it better than the LLM.)

    Yet another illustration that ChatGPT really does not know anything. It’s just spitting out the most likely-looking answer, and a lot of the time it’s close enough. This time it wasn’t.

    On the rough stuff from the API: some fields are either boolean false (“nothing here”) or a struct. Because Swift is a strongly-typed language, this has to be dealt with via an enum and more complex parsing. At the moment, I can get away with failing the parse and using a default value if this happens, but longer-term, the parsing code should use enums for this. If there are multiple fields that do this it may end up being a bit of a combinatorial explosion to try to handle all the cases, but I’ll burn that bridge when I come to it.

  • word break: now where have I seen this before?

    I’m now doing some paired leetcode study sessions with a former ZipRecruiter co-worker, Marcellus Pelcher (LinkedIn). As he says, “I’ve been doing these [leetcode exercises] for a long time!” and it’s been great both to get to spend some time with him and to work together on some problems.

    I’m generally doing OK on easy problems so after getting our video call working, we looked at the word break problem. This essentially boils down to a pattern match: given a list of words, and a string of characters, figure out if the string can be broken into pieces that all match a word in the list of words.

    Examples?

    • leetcode and ["leet", "code"] succeeds (by inspection!)
    • catsandog and ["cat", "cats", "sand", "dog"] fails (no combo lets us match the final og).

    We agreed that this was probably meant to be a dynamic programming problem (way to saddle myself with a hard one right out of the gate!). I put together a recursive greedy algorithm (similar to the good old SNOBOL4 pattern matcher) in ten minutes or so, with a certain amount of flailing at accessing the strings, but it took too long to get to a solution, and it was a huge memory hog. Marcellus did a dynamic programming solution in Java that used a bit array to track the matches, and had it done and passing in about 15-20 minutes. So one for him and zero for me! πŸ™‚

    I let the solution rattle around in my head and after taking a shower, it suddenly dawned on me that this was very much like the coin change problem, but instead of reducing the count, we’re trimming off substrings. I went back to work from there and finally realized that the bit array was tracking the ends of successful matches, which made the dynamic programming recurrence go like this:

    • Start at the beginning of the string (position 0) and find all words that match at that position. Record where they end. That’s our starting configuration.
    • Continue moving forward through the string, comparing all the words at the current position.
    • If the word matches at the current position, it successfully extends a match if and only if a previous match ended right before where this one starts. If this is true, then we record the end of the newly-matched word. Otherwise we just continue to the next word.
    • If a match ends at the end of the string, we’ve successfully found some match or another; for this problem, that’s all that matters.

    Let’s do an example. Let’s say our string is “catsupondog” and our words are “cat”, “cats”, “catsup”, “upon”, “on”, and “dog”. At position zero, “cat”, “cats”, and “catsup” all matched, so we’ll record the ending indexes in a bit array. After our startup matches, things look like this:

    catsupondog
    0123456789A
    00110100000
     AA A
     || |
     || +-- catsup matched up to here
     |+------ cats matched up to here
     +-------- cat matched up to here

    As the index moves up through the string, we keep checking all the words against it. If we match another word, we then look to see if the position just before the start of this new match marks the end of a successful match.

    When we get to position 4, “upon” matches, and position 3 is true, so this extends the match, and we mark position 7 as the end of a successful match.

    catsupondog
    0123456789A
    00110101000
       A   A
       |   |
       |   +--- "upon" extends the "cats" match
       +------- "cats" match ended here

    When we get to position 6, “on” matches, and we mark position 7 as end of a match again. (Note that marking it twice is fine; we just have two different successful matches that end there, and for this problem, which one got us to 7 doesn’t matter.)

    catsupondog
    0123456789A
    00110101000
         A A
         | |
         | +--- "on" extends the "catsup" match
         +----- "catsup" match ended here

    When we finally get to position 8, “dog” matches, and position 7 is true, so this extends the match again. We’re at the end, so we’re done, and successful. We don’t have to check any more past position 8.

    catsupondog
    0123456789A
    00110101001
           A  A
           |  |
           |  +--- "dog" matches, reaches the end, and succeeds
           +------ end of "on" and "upon" matches; which one doesn't matter
            

    Enough chatter, here’s some code.

    func wordBreak(s string, wordDict []string) bool {
        
        // Records where matches ended successfully.
        matched := make([]bool, len(s))
    
        for i := 0; i < len(s); i++ {
            // Check every word at the current position.
            for _,word := range(wordDict) {
                if i + len(word)-1 < len(s) {
                    // Word can fit in remaining space
                    if s[i:i+len(word)] == word {
                        // matches at the current index
                        if i == 0 || matched[i-1] {
                            // If we're at zero, we always succeed.
                            // Otherwise, check if a successful match ended right
                            // before this one. If it did, mark the end of this
                            // match as successful too.
                            matched[i+len(word)-1] = true
                            if i+len(word)-1 == len(s)-1 {
                                // Short-circuit if we just successfully
                                // reached the end with this match
                                break
                            }
                        }
                    }
                }
            }
        }
        return matched[len(s)-1]
    }

    Bottom 7% in performance, but I’m giving myself a win because I actually figured out a dynamic programming problem by getting what the recurrence was. Marcellus told me, I just didn’t follow his explanation! [EDIT: I removed a fmt.Println of the bit array inside the loop and now it’s top 80%! Calling that a pass for an interview too.]

  • Making change via dynamic programming: easy when you see it

    I failed to see this is an induction problem, and did not think of an elegant way to represent “we have no solution” for a given amount.

    The problem description is “given a set of coins of n denominations, figure out the minimum number of coins needed to make change for a given amount. If you can’t make change, return -1.”

    So here’s the induction:

    • Assume it’s impossible to make change for any amount up to the target amount by setting the number of coins to > the max amount. (For this problem it was 2**32, which I’ll refer to as “impossible”).
    • For an amount of 0, it takes 0 coins to make change.
    • For each amount after 0, the number of coins needed to make change for this amount is 1 of the current coin, plus however many coins it took to make change for amount - current coin value.
    • If we couldn’t make change for amount - current coin value (value for that is impossible), then try the next coin.
    • If we run out of coins, we can’t make change for that amount, leave it set to impossible, and we move on to the next amount up.
    • When we reach the final amount, we’ve either found a chain of sub-solutions that reach 0 (change is made with n coins, the sum of all the sub-solutions) or it’s impossible.

    Because this builds a table of solutions from the bottom up, we’re always guaranteed that the solution for any amount < the current amount has already been solved, and we always try all the coins for every amount, so we’re guaranteed to find a solution if there is one, even if the coins are not in sorted order.

    I chose to use int64‘s and set the FAILURE amount to an amount specified in the problem description as definitely larger than any amount that is possible. You could do it with a map[int]int, checking for “no entry”, but using FAILURE allows the lookback to always work with just a less-than test, so it’s probably faster. I’ve updated the variable names to make it clearer how this works.

    
    func coinChange(coins []int, amount int) int {
        const FAILURE int64 = 5000000000
    
        // Assume everything is impossible...
        changeFor := make([]int64, amount+1)
        for i := 0; i < len(changeFor); i++ {
            // effectively Inf
            changeFor[i] = FAILURE
        }
    
        // ...except 0. Change for 0 is 0 (it takes no coins to make change
        // for 0.
        changeFor[0] = 0
    
        // For every amount up to the target amount (i.e., solve every
        // subproblem):
        for currentAmount := 1; currentAmount <= amount; currentAmount++ {
    
            // Try every coin we have to solve the problem.
            for _, coin := range(coins) {
    
                // If this coin might work...
                if coin <= currentAmount {
    
                    // Get the answer for the subproblem.
                    lookBack := changeFor[currentAmount - coin]
                    
                     // This if statement is doing a lot of elegant
                     // heavy lifting.
                     //
                     // If the subproblem had no solution, the lookback is
                     // FAILURE, and FAILURE + 1 is greater than the current
                     // value (FAILURE or less). This guarantees that we don't
                     // overwrite any solution that was already found, and that
                     // we leave it as FAILURE for this coin!
                     //
                     // If the subproblem DID have a solution, and it's better
                     // (less) than what we have (some number of coins or
                     // FAILURE), then we change the current solution for this
                     // amount to the new count.
                     //
                     // This is why the order of the coins in the coin array
                     // doesn't matter: we will always try ALL the coins for
                     // a given amount, and if a different coin has a better
                     // (smaller) solution, we'll change the solution for the
                     // current amount for the better one!
                     //
                     // This means we're guaranteed to have the best solution
                     // for every amount < the current amount (including 0),
                     // so composing a solution for the current amount from
                     // previous solutions always gets the best one for this
                     // amount.
                     //
                     // The fact that it takes a comment THIS LONG to accurately
                     // explain TWO LINES OF CODE...!
                    if lookBack + 1 < changeFor[currentAmount] {
                        // Previous number of coins solving this, plus one more
                        // for the current coin solving this amount better than
                        // we previously did. We never get here if we don't find
                        // a solution for any of the coins, leaving this set to
                        // FAILURE.
                        changeFor[currentAmount] = lookBack + 1
                    }
                }
            }
        }
        // If we never found a solution, the caller wants -1.
        if changeFor[amount] == FAILURE {
            return -1
        }
        // Retrieve the solution from the solution cache. If we were
        // going to use this a lot, we'd save the cache.
        return int(changeFor[amount])
    }
    

    I hope this commented code shows how disgustingly elegant this is. I didn’t find this solution, but I feel like working through why it works has helped me get a better grasp on dynamic programming.

    Lessons learned:

    • What can you solve by inspection?
    • How do you mark things as “not solved” or “insoluble”?
    • Don’t be afraid to solve a lot of seemingly unneeded subproblems if you can use them to break down the problem you have.

    I initially tried a greedy solution for this problem, and that is actually optimal if the set of coins you’re given can always make up any number. (E.g., US coins are 1 cent, 5 cents, 10 cents, 25 cents, and 50 cents; you can use those to make any number of cents from 1 to 100.) If the set of coins doesn’t guarantee this, then you have to use the dynamic programming approach to solve it.

    The second, recursive try might have worked if I’d cached solutions found along the way, effectively turning the iterative lop here into a recursive one, but I didn’t and it DNF’ed.

    Right, stats. 79th percentile on runtime, 80th on memory. Pretty good.

  • leetcode month review

    Summary: I am going to have to do a lot more practice, and brush up on some things, especially dynamic programming, if I’m going to do well on medium to hard questions.

    The good:

    • The practice in the past few weeks has got the idiosyncrasies of Go slices and arrays mostly under my belt. Stuff I was struggling with in the first few exercises is now pretty reflexive. I’m reasonably certain that I can manipulate slices as I need to.
    • I really do remember pointer algorithms from my systems programming days. I was, in some cases, finding it easier to implement algorithms using linked lists for variable linear storage than slices. (Slices have improved a lot though, so I’m not reaching for linked lists first.)
    • I’m doing okay with recursion now.
    • Tree algorithms are not as bad. The medium height-order problem was easy for me to conceptualize; I just had trouble getting the slice of slices to work.
    • If the code can use a queue-based algorithm, I have no problem writing it. 01 matrix was a breeze once I committed to trying the queue, and the same kind of solution worked well for the “clone a graph” problem. I can also see how I could have used it for flood fill.
    • I knocked out the k points closest to the origin problem well under the time limit with a very fast solution; it was pure “how do I fill and then sort a custom data structure?”.
    • sort.Ints, sort.Slice, and sort.Search are now my very good friends and help immensely.

    The bad:

    • I am pretty bad at numerical algorithms, like 2sum/3sum, or the inserted interval. I am not intuiting the algorithms at all well.
    • I am still not properly catching myself falling down rabbit holes or not being on the right track. I need some heuristics. Maybe “if the if statement are more than 2 deep, you may be off course”.
    • If I get a dynamic programming problem, I’m sunk. I might be able to code something if it ends up having a Fibonacci relation, but anything more complex is currently beyond me. The DP solution to the length of the longest substring without repeats is just completely black magic. It works, but I cannot see how it possibly could.

    leetcode feels like programming’s highs and lows all condensed into a single highly compressed experience: sometimes it’s absolutely crystal clear, and you just need to get it all typed in (all of the copying of the graph into an adjacency array and recloning the nodes worked the first time; the only thing that didn’t was the link builder, which I had left one line out of).

    So sometimes I’m enjoying it, and sometimes I am very much not. I sorely miss a real debugger, but I’m getting along with fmt.Println. I would give a lot to be able to build things the way I prefer, bottom up with tests, but leetcode not gonna work that way.

  • leetcode day 30: RPN

    Take an array of strings in RPN and evaluate it.

    This is all about managing the stack. I started trying to code a Stack class, but dumped that for simply using append and slicing to manage the stack. Performance was okay: 73% runtime, 41% memory; probably good enough for a pass.

    func evalRPN(tokens []string) int {
        stack := []int{}
        for _,token := range(tokens) {
            switch token {
                case "+":
                op2 := stack[len(stack)-1]
                op1 := stack[len(stack)-2]
                stack = stack[:len(stack)-1]
                stack[len(stack)-1] = op1 + op2
    
                case "-":
                op2 := stack[len(stack)-1]
                op1 := stack[len(stack)-2]
                stack = stack[:len(stack)-1]
                stack[len(stack)-1] = op1 - op2
    
                case "/":
                op2 := stack[len(stack)-1]
                op1 := stack[len(stack)-2]
                stack = stack[:len(stack)-1]
                stack[len(stack)-1] = op1 / op2
        
                case "*":
                op2 := stack[len(stack)-1]
                op1 := stack[len(stack)-2]
                stack = stack[:len(stack)-1]
                stack[len(stack)-1] = op1 * op2
    
                default:
                  i, err := strconv.Atoi(token)
                  if err != nil {
                      panic(fmt.Sprintf("unparseable token %s", token))
                  }
                  stack = append(stack, i)
            }
        }
        return stack[len(stack)-1]
    }

    My guess is that if I used a real stack pointer to manage the stack, it’d be faster.

    func evalRPN(tokens []string) int {
        stack := make([]int, 10000)
        sp := -1
        for _,token := range(tokens) {
            switch token {
                case "+":
                op2 := stack[sp]
                op1 := stack[sp-1]
                sp--
                stack[sp] = op1 + op2
    
                case "-":
                op2 := stack[sp]
                op1 := stack[sp-1]
                sp--
                stack[sp] = op1 - op2
    
                case "/":
                op2 := stack[sp]
                op1 := stack[sp-1]
                sp--
                stack[sp] = op1 / op2
    
                case "*":
                op2 := stack[sp]
                op1 := stack[sp-1]
                sp--
                stack[sp] = op1 * op2
    
                default:
                  i, err := strconv.Atoi(token)
                  if err != nil {
                      panic(fmt.Sprintf("unparseable token %s", token))
                  }
                  sp++
                  stack[sp] = i
            }
        }
        return stack[sp]
    }

    And no, it is not! Exact same runtime percentile, and used a little less memory (35th percentile).

    Couple of quite elegant idea in the solutions; I really liked this one:

    func evalRPN(tokens []string) int {
        stack := make([]int, 10000)
        sp := -1
        ops := map[string]func(int,int) int{
            "+": func(i, j int) int { return i + j},
            "-": func(i, j int) int { return i - j},
            "/": func(i, j int) int { return i / j},
            "*": func(i, j int) int { return i * j},
        }
        for _,token := range(tokens) {
            i, err := strconv.Atoi(token)
            if err != nil {
                // operator
                stack[sp-1] = ops[token](stack[sp-1], stack[sp])
                sp--
            } else {
                sp++
                stack[sp] = i
            }
        }
        return stack[sp]
    }

    We get rid of the switch altogether using the fact that the tokens are either valid numbers or operators, and we make the operators into anonymous functions in a map. If the token fails to convert to a number, then we look it up and call the appropriate function. This is far more fragile though, since any bad token will result in a “no such key” error on the ops map…and we only gain 3% in the runtime. I’ll settle for the first one as good enough for a pass.

  • leetcode day 30: clone a graph

    Given a graph, clone it and return it with all links intact.

    I chose to use a caching crawl of the graph, and then reconstruct the new graph from the cache.

    /**
     * Definition for a Node.
     * type Node struct {
     *     Val int
     *     Neighbors []*Node
     * }
     */
    
    
    func cloneGraph(node *Node) *Node {
        // need to crawl the graph and make sure we've visited all
        // the nodes, saving the connections at each node. Once we
        // have all the nodes and connections, we can rebuild the
        // graph.
    
        // start at the first node. add it to the queue.
        // while the queue is not empty:
        //  - pull the next item
        //  - if this node is not in the map, add it, and add all the
        //    values from the nodes it links to as the adjacency list.
        //  - for each node in the adjacency list:
        //      if the node is not in the map, add it to the queue
        //         to be explored
        if node == nil {
            return nil
        }
    
        rootNodeVal := node.Val  // will be used to find and return new root
    
        queue := []*Node{node}
        // adjacency list for scanned nodes:
        //   key is node pointer
        //   value is list of adjacent nodes
        adj := map[*Node][]*Node{}
    
        for len(queue) != 0 {
            item := queue[0]
            if len(queue) == 1 {
                queue = []*Node{} // empty it
            } else {
                queue = queue[1:]
            }
            for _,neighbor := range(item.Neighbors) {
                if _, ok := adj[neighbor]; !ok {
                    // not visited, add to queue
                    queue = append(queue, neighbor)
                }
            }
            adj[item] = item.Neighbors
        }
    
        // all nodes crawled. each entry in the map is a node address ->
        // list of neighbors.
        
        type task struct {
            from int
            to int
        }
    
        valToNode := map[int]*Node{}
        linkTasks := []task{}
        for nodeptr, neighbors := range(adj) {
            // we have a pointer to the old node, and all the things it points to.
            // - allocate a new node and record it.
            n := &Node{
                Val: nodeptr.Val,
                Neighbors: []*Node{},
            }
            valToNode[nodeptr.Val] = n
            for _,linkedNode := range(neighbors) {
                if target, ok := valToNode[linkedNode.Val]; ok {
                    // Node has been reallocated, link it
                    n.Neighbors = append(n.Neighbors, target)
                } else {
                    // not allocated yet, add link task for it
                    linkTasks = append(linkTasks, task{from:n.Val, to:linkedNode.Val})
                }
            }
        }
        fmt.Println(valToNode)
        // At this point we have allocated all the nodes, but have links
        // left to handle in the queue. for each item in the queue:
        //    - look up the from and to nodes in the map
        //    - append a link to the 'to' node to the 'from' node's link list.
        // when the queue is empty, the links have been built, and we're done.
        for _,task := range(linkTasks) {
            fmt.Println(task.from, task.to)
            fmt.Println(valToNode[task.from])
            fmt.Println(valToNode[task.to])
            fromNode := valToNode[task.from]
            toNode := valToNode[task.to]
            fromNode.Neighbors = append(fromNode.Neighbors, toNode)
        }
        // return the root node.
        return valToNode[rootNodeVal]
    }

    Works, but slow: 32nd percentile runtime, 8th percentile memory. So there must be some tricks here. (Betting on arrays instead of maps.) Yes, I’ve overengineered it: there are 100 nodes at max, and the node.Val willl be 1-100. SO I can allocate an array instead of a map for my valToNode map. Let’s try that first and see how much speed we get…absolutely none! Exactly the same performance!

    Well. What else can I do to speed this up? All right, let’s make the initial adjacency computation just store the integer node values. Since these are unique per node, and we have 100 or less, we can make that [][]int and see what happens.

    Had a little trouble that I tracked down to the queue drop getting lost in the adjacency computation. Once that was back in, it worked fine, and now is perfectly acceptable: 78th percentile CPU. Memory still sucky, but I’ll take it.

  • leetcode day 30 – definitely struggling

    3 sum: given a list of numbers, find all distinct triples that sum to 0. The hint was to transform this to a 2 sum problem, and look for the 2 sum that solves the problem for the current index.

    My initial try spent a lot of effort trying to track the used values in maps, and I did not learn my lesson: when the solution feels too hard, you’re probably doing it wrong.

    After I ran out of time, I started looking at solutions; I had started trying to sort the answers and see if there were duplicates, and it sort of worked, but failed some test cases in a non-obvious way.

    The real attack for this kind of problem starts out by sorting the numbers first. You can then start depending on math identities to help winnow the solutions, eliminate duplicates, and shrink the search space all at once.

    Here’s the solution that was clearest, with my comments and variable name adjustments to make the code clearer yet.

    func threeSum(nums []int) [][]int {
        // Output slice.
        out := [][]int{}
        // Sort the numbers ascending first.
        sort.Ints(nums)
    
        var left, right, sum int
    
        // We're going to search for three terms. We're going to just move
        // left to right for the first one.
        for first, target := range nums {
            // Skip duplicate terms at the start. We always look at nums[0]
            // and start deduping after that.
            if i > 0 && a == nums[i-1] {
                // Same as previous so any solutions using this would be dups.
                // Skip to the next item.
                continue
            }
            // Start with second item just after the first and third at the end.
            second, third = first + 1, len(nums) - 1
            // As long as the second pointer hasn't hit the third, keep looking.
            for second < third {
                // calculate the sum
                sum = target + nums[second] + nums[third]
                if sum == 0 {
                    // This is a solution, save it.
                    out = append(out, []int{target, nums[second], nums[third]})
                    // Skip duplicates of the second term. We increment once to
                    // move to the item after the one we're dedeuping.
                    second++
                    for second < third && nums[second] == nums[second-1] {
                        second++
                    }
                } else if sum < 0 {
                    // second number is too small to work; third is the largest
                    // as-yet-unused value, so the second has to be larger to
                    // move the sum closer to zero.
                    second++
                } else { // sum is > 0
                    // third number is too large to work. The second number is
                    // already the smallest possible for the next solution, so
                    // we have to reduce the third to move closer to zero.
                    third--
                }
            }
        }
        return out
    }
    
    

    This is good clever as opposed to bad clever, as it’s clear how this actually works. Notice how we keep shrinking the search space as we move the first pointer upward, and how the duplicate elimination at the first and second pointers speeds this up even further. We’re still O(N^2), but we’re discarding as much possible duplicate work as we can.

    This is the kind of problem that I find quite difficult; I am not very friendly with numbers, despite a lot of programming over the years. I have spent more time chasing control block chains and securing inputs than I have shuffling numbers, and it shows: I have a lot less trouble with linked lists than these numerical problems.

    Not feeling great about my chances to pass one of these live at the moment. Maybe it’s just “more practice needed”. I hope it is, anyway.

  • leetcode day 29 – longest substring

    First solution worked, but was slow and used a lot of memory (5th percentile for both):

    func lengthOfLongestSubstring(s string) int {
        if s == "" {
            return 0
        }
        
        // This looks like a sliding window.
        longest := []byte{}
        current := []byte{}
        m := make(map[byte]int)
        bs := []byte(s)
        
        i := 0
        scan := 0
    
        for {
            // scan from current point.
            m[bs[scan]]++
            if m[bs[scan]] < 2 {
                // still no repeat, now add it
                current = append(current, bs[scan])
                scan++
                if scan == len(bs) {
                    // no more chars to add, current finished
                    // and string finished
                    break
                }
            } else {
                // repeat; restarting scan at next point
                // for a new possible substring, see if 
                // the new substring is the new longest.
                if len(current) > len(longest) {
                    longest = current
                }
                // in either case, we have to try for a new
                // non-repeater in the next window
                i++
                if i == len(bs) {
                    // no new window left, finish up
                    break
                }
                scan = i
                current = []byte{}
                m = make(map[byte]int)
            }
        }
        if len(current) > len(longest) {
            longest = current
        }
        return len(longest)
    }

    So this is the right track — the map to manage the characters we’ve seen is definitely right, but there has to be a trick here — and that is to use the index of the character as the hash value, and use that to help us manage the sliding window.

    func lengthOfLongestSubstring(s string) int {
        if s == "" {
            return 0
        }
    
        m := make(map[byte]int)
        bs := []byte(s)
        
        start := 0
        maxLen := 0
        currentLen := 0
    
        // Iterate through the string and slide the window each
        // time you find a duplicate.
    	// end is the end of current substring
    	for end := 0; end < len(bs); end++ {
    		// Check if the current character is a duplicate
    		if dupIndex, ok := m[bs[end]]; ok {
                // New max?
                if maxLen < end - start {
                    maxLen = end - start
                }
    			// forget index for all characters before the dup
    			m = make(map[byte]int)
    
    			// Slide the window to just after the dup.
    			start = dupIndex + 1
                currentLen = 0
                end = start
    		}
    		// Add the current character to hashmap
    		m[bs[end]] = end
            currentLen++
    	}
        if currentLen > maxLen {
            maxLen = currentLen
        }
        return maxLen
    }

    That’s a little better, but it’s still pretty slow. 13th percentile runtime, 20th percentile CPU. (If you use strings, the percentiles drop to 5 and 5. No better than the dumb solution!) Looking at the solutions, it is possible to do the whole thing with indexes…but I definitely do not grok it.

    func lengthOfLongestSubstring(s string) int {
        if s == "" {
            return 0
        }
        if len(s) == 1 {
            return 1
        }
    
        // brackets current longest found
        startMax := 0
        endMax := 0
    
        // start of window
        start := 0
        dupIndex := 0
    
         // Scan end pointer of window forward
        for end := 1; end < len(s); end++ {
            // The "I didn't find it" value
            dupIndex = -1
            // looking for duplicate inside window
            for j := start; j < end; j++ {
                if s[j] == s[end] {
                    // Found it
                    dupIndex = j
                    break
                }
            }
            // Dup found inside window
            if dupIndex >= 0 {
                if (end - start - dupIndex) > (endMax - startMax) {
                  startMax = dupIndex
                  endMax = end
                }
                start = dupIndex + 1
            } else {
                // no dup in window
                if (end - start + 1) > (endMax - startMax) {
                  startMax = start
                  endMax = end + 1
                }
            }
        }
        return endMax - startMax
    }

    71st percentile CPU, 96th percentile memory (memory makes sense, we barely use anything other than indexes into the string). However, the big winner is dynamic programming:

    func lengthOfLongestSubstring(s string) int {
        rv := 0
        // array as hash for extra speed and no dynamic memory.
        // create it, mark it all invalid.
        last := [256]int{}
        for i := range last {
            last[i] = -1
        }
    
        // define min inline.
        min := func(a, b int) int {
            if a < b {
                return a
            }
            return b
        }
    
        size := 0
        for i, b := range []byte(s) {
            size = min(size + 1, i - last[b])
            last[b] = i
            rv = rv + size - min(rv, size) // max(rv, size)
        }
    
        return rv
        
    }

    This is black magic and I do not understand it yet!