Category: Go

  • leetcode month, day 2: Fenceposted by “valid parens”

    Today’s challenge is Valid parens. And it reminded me of several of the things that I don’t particularly like in Go.

    Problem statement boils down to “here’s a string of brackets: parens, square brackets, braces. Tell us if they match properly (true) or don’t (false)”. Algorthmically this is dead simple:

    Start with an empty stack.
    for each character in the string:
      if it's an open bracket, push it.
      if its a close bracket:
        if the stack is empty, return false.
        if the top of the stack isn't the matching open, return false.
        pop the stack.
    
    When we're done, return true if the stack is empty, false otherwise.

    In Perl or Python, I’d have a stack pointer, and move it up and down an array as I pushed and popped characters by altering the stack pointer and assigning to the appropriate point in the array. If the stack pointer went below 0, I’ve gotten too many close parens, and I return false.

    This approach doesn’t work well at all with Go, because Go arrays and slices Do Not Work Like That.

    To push onto an existing array, we need to use a := append(a, new). Just assigning to an index outside of the slice’s current boundaries will panic.

    We could still use the stack pointer to move backward, but then we’d have to have more complex code to decide if we need to append or just move the pointer. (Side note: doing it that way would probably use less memory — the working version of my solution used more memory than 90% of the other solutions). Instead, we just use the slice notation to pop the stack instead, with a := a[:len(a)-1].

    My original iterations with a real stack pointer failed miserably because I wasn’t properly tracking the length of the array, and was perpetually getting the stack pointer position wrong, causing the code to panic. It was only after completely discarding the stack pointer that I got it to finally work.

    Items of note:

    • I remembered I’d need to use range() to iterate over the string, but forgot that I would have to string() the characters I was getting so they were strings instead of runes. I wasted a good chunk of time trying to mess with runes before I realized that I needed string() instead. Runes are considerably more second-class citizens.
    • Still running afoul of using make() properly. Finally figured out the syntax to create an empty string array, but it took some Googling to get it.
    • I decided to make the code less complex by creating a pairs map that mapped left brackets to right ones. This meant I could check pairs[left] == right for whether I’d matched the left and right brackets. It also meant that if we ever added more bracket pairs in the future, it’d be be easier to implement them.
    • I got fenceposted by a[len(a)-1] accessing the last element, and a[:len([a)-1] dropping the last element. I naively expected that since a[len(a)-1] accessed the last element, I’d want a[:len(a)-2] to drop that last element, but that takes one too many, running me into another fencepost. Yes, it’s good that it’s symmetric, and now I’ll remember, but I definitely had to look it up to figure out both of them.
    • Forgot the check for “did we close all the opens”. I probably would not have missed it if I was making my own test cases, but it wasn’t in the tests. Note to self: watch out for poor initial test cases. There is an opportunity to add more, I think?

    So how’d I do overall? We saw I was in the bottom 10% in memory usage, boo, but runtime? “Beats 100.00% of users with Go”.

    I’ll definitely take that as a win, but it’s definitely obvious I need to keep practicing and getting these idioms back under my belt.

    func isValid(s string) bool {
        // We'll run through the string, doing the following:
        // - if we see an open bracket, we stack it.
        // - if we see a close bracket, we pop the stack. If the
        //   brackets match, we continue, otherwise we fail.
        // - when we run out of brackets, the stack must be empty or we fail.
    
        stack := make([]string,0)
    
        pairs := map[string]string{
            "(": ")", 
            "[": "]", 
            "{": "}",
        }
    
        for i := range s {
            char := string(s[i])
           _, ok := pairs[char]
           if ok {
               // left, push it
               stack = append(stack, char)
           } else {
               // right, stack empty?
               if (len(stack) == 0) {
                   fmt.Println("pop of empty stack")
                   return false
               }
               // Not empty. match?
               if (pairs[stack[len(stack)-1]] != char) {
                   fmt.Println ("mismatch")
                   // mismatched bracket, fail
                   return false
               }
               // Match. Pop stack.
               stack = stack[:len(stack)-1]
           }
        }
        // Check for "did we close all the open brackets".
        return len(stack) == 0
    }

    That’s one for today, and I need to clean the house, so I’ll call it a day.

  • leetcode month, day 1: Twosum and Add Two Numbers

    Well, these were fun, and definitely showed me I’m not ready to walk into an interview at the moment!

    Twosum

    Twosum is an easy challenge: you are given an array of numbers and a target value; your job is to find the only two numbers at different indexes in the array that sum to the target value and return those two indexes.

    The brute-force strategy just repeatedly walks through the array, trying to find the right pair of numbers. This ends up being O(n^2) because we essentially run over the upper triangle of a square of the elements of the array> If in the example below were looking for numbers that sum to 9, we’d run over the numbers four times finally finding it on the fourth pass, and doing 4+3+2+1 = 10 checks. (We can save operations by precomputing the second term in the sum: subtract the first search value from the target sum, and just compare that against the value in the array at that index instead of summing the two.)

    >1  2  3  4  5
     1 >2  3  4  5
     1  2 >3  4  5
     1  2  3 >4  5

    But the old Perl adage “when someone says search, consider a hash” comes in handy here. We can precompute exactly the number we want, so if we can make the values the keys of a hash (and we can, as we know that there is only one pair that sums to the target value), then we can make the indexes the values.

    We walk through the array twice:

    • The first iteration builds the hash by taking each value and inserting it as the key with the index as its value. And since we’re doing one iteration over the whole array anyway at this point, we can check each item as we insert it to see if it hits the target with the first item!
    • If we don’t luck out during the insert, then we iterate over items 2 to N, calculating the value we’d need to hit the target, and then doing a hash lookup to see if it’s there The hash lookup is O(1), and the pass over the array (including the load) is also O(1), so we’ve made a pretty good reduction in overall runtime.

    Things I’d forgotten in the last year of pretty much nothing but Scala:

    • The only loop structure in Go is the for loop.
    • var, not val. val is Scala, and as one tries for as much immutable data as possible in Scala, my fingers are trained to type val at every turn.
    • Go array initializer syntax. Square brackets, then the type, then the values in curly braces.
    • I remembered I needed to make() a map, but used Scala syntax instead of Go syntax trying to specify the types.
    • Go does not consider [2]int to be the same as []int. Fair.
    • Gotta have the parens for every function. Perl and Scala made me sloppy about that.

    Things discovered:

    • Adding the “try to catch it in the load pass” made a significant speed difference when the first term was at index 0.
    • Putting the first index in the array initializer for the result array instead of assigning it vastly increased the speed of the search loop — I went from the top 75% to the top 95% with that one change.
    • The hash solution uses more memory; I was at the 15th percentile on memory, but I’ll definitely take being faster than 95% of the other solutions as better.
    • I missed the “must be different indexes” restriction on my first pass; fortunately the second test case was for 6 with [3, 4, 2], and I got a test fail for [0,0].

    Here’s the final version. I do comment even when writing these because I can line out the code with comments and put the actual code itself in later. Writing down the assumptions helps clarify them as well (thanks loads, ADHD).

    func twoSum(nums []int, target int) []int {
        // Given that
        // - there is only one solution, then all entries must be unique
        // - therefore they can be used as keys to a hash
        // - this means we can iterate O(1) over the array for term 1
        //   and lookup term 2 in the hash, getting its index. No need
        //   for a linear search for it.
    
        // Fill the hash...and see if we can catch the solution
        // during this pass, since we have to do it anyway.
        locations := make(map[int]int)
        prematch := target - nums[0]
        for i := 0; i < len(nums); i++ {
            if i > 0 && nums[i] == prematch {
                return []int{0, i}
            }
            locations[nums[i]] = i
        }
    
        // scan the nums, looking for a match.
        // first match wins.
        for i := 0; i < len(nums); i++ {
            result := []int{i, -1}
            term2 := target - nums[i]
            j, ok := locations[term2]
            if ok {
                // Disqualify same-entry match. Missed this first time,
                // thank you test cases.
                if i == j {
                    continue
                }
                // Found
                result[1] = j
                return result
            }
        }
        // This is a "should not get here"; meant to show me if I've totally
        // blown the loop.
        panic("did not find a solution")
    }

    Add two numbers

    This one was recommended by leetcode as a “hot” problem, and it looked like fun, so I did it. It’s not in the Grind 75 list, but it’s a nice pointer manipulation problem, and I was a bit stale on pointers.

    The function is passed two linked lists; each list has an integer from 0-9 and a pointer to the next digit. The digits are presented in right-to-left order. The function is to take these two lists, and return a new list that represents the sum of the two numbers. Of course, the lists can be different lengths; if they are, then the list that runs out first is treated as if it were returning zeroes for the purposes of the sum.

    It was instantly clear that the challenge here would be maintaining the scan pointers for the two numbers and the result, and that trying to do bookkeeping for all of it in the loop would suck. So my approach was to construct a function to return a closure that would follow the rules for reading the numbers off the list:

    • If the scan pointer is null, return 0 and false
    • If the scan pointer is not null, return the node’s value and true, advancing the pointer to the next node.

    This meant that the list management was dead simple:

    first  := iterator(L1)
    second := iterator(L2)
    
    fVal, fState := first()
    sVal, sState := second()

    Each call to the first and second closures returns the next value to use in the sum; when both are false, the sum is complete.

    So all I had to manage in the loop was the scan pointer for the sum, and the carry from the addition operation: adding it in, clearing it, and recalculating it after the current operation.

    Things learned/relearned:

    • Had to remember when I needed * and when I didn’t for variable types.
    • The first run segfaulted. I realized that I’d mismanaged the scan pointer; I needed to save the root node, and then scan forward while doing the operations.
    • Once I fixed that, I found out I wasn’t clearing the carry. Whoops. Easy to fix.
    • The closures worked perfectly.
    • The “end the loop” and “restart the loop” keywords are break and continue. Trivial, but I had to go look it up.

    I did make one major mistake: I missed that the output was supposed to be another list, and started calculating the sum as an integer. This wasn’t too terribly off from the desired solution; I had figured out that I’d need the carry tracking, and I had a power variable that I was multiplying by 10 to switch columns in the output variable, but that probably wasted 5 or 10 minutes and might have made the difference between a pass and a fail in an interview.

    It was pretty easy to switch to the list building, but lesson hammered home again: be sure you read the problem statement correctly and know what the output is. Confirming with the interviewer that I got the details right is probably a good idea in a timed problem.

    **
     * Definition for singly-linked list.
     * type ListNode struct {
     *     Val int
     *     Next *ListNode
     * }
     */
    func addTwoNumbers(l1 *ListNode, l2 *ListNode) *ListNode {
        // The tricky bit with this one is going to be watching out
        // for the end of the list.
    
        // the dumbest and easiest way is to have a bool for each
        // list; when one list has no more entries, we set its bool
        // to false and return a zero value for it.
    
        // when both lists return a false value, we're done.
        first := iterator(l1)
        second := iterator(l2)
        var currentDigit *ListNode
    		var root *ListNode
    		carry := 0
    
        for {
            fVal, fState := first()
            sVal, sState := second()
    
            if (!fState && !sState) {
                // run off the end of both. Stop loop, finalize sum.
    						fmt.Println("Done")
                break
            }
            // At least one still returning a value. (The other returns 0
            // if there's no value.)
            // Sum the digits and the curent carry; if > 9,
            // take mod 10 and set the carry to 1. (no two
            // digits can sum to > 18).
            digitSum := fVal + sVal + carry
            carry = 0
            if digitSum > 9 {
                carry = 1
                digitSum = digitSum - 10
            }
            // Add space for a new digit, append it, continue.
            if currentDigit != nil {
                currentDigit.Next = &ListNode{digitSum, nil}
    		    currentDigit = currentDigit.Next
            } else {
    		    // Create and save root digit
                currentDigit = &ListNode{digitSum, nil}
    			root = currentDigit
            }
        }
        if (carry != 0) {
            // last addition had a carry we need to keep
            currentDigit.Next = &ListNode{carry, nil}
        }
        return root
    }
    
    func iterator(l *ListNode) func()(int, bool) {
        // Close over the pointer.
        p := l
        return func()(int, bool) {
            if p == nil {
                // Reached end. Keep returning 0 and false.
                return 0, false
            } else {
                // Capture next digit, advance pointer,
                // return next digit and true. If new pointer
                // is nil, we'll just return 0 and end-of-list signal
                // forever.
                v := p.Val
                p = p.Next 
                return v, true
            }
        }
    } 

    So how’d I do overall in comparison? Pretty fuckin’ good. I was in the top 95% on speed, and the top 91% in memory use. I suspect that managing all the bookkeeping in the loop might make it a tad faster (no call overhead for the closures), but at the cost of significantly more complexity. This solution is fast and small, and I’ll take it.

    Conclusions

    • My Go is pretty good. I didn’t spend a huge amount of time chasing bugs, and had it in a couple tries. I have a good grasp of the concepts and know mostly how to do things.
    • My Go is fairly stale. I don’t remember how to say things!
    • I had to look up a lot of things that I should have just remembered, like len() for array length (not .len, that’s Scala!). I need this practice!

    Tomorrow I’ll start off with Valid Parentheses.

  • Flexing the muscles: November leetcode challenge

    Most people do a novel for NaNoWriMo, and that’s a great thing.

    I am not a great fiction writer; maybe I’d be better if I practiced, but right now, writing code makes more money, so I’m going to spend the month of November working through the Grind 75 leetcode practice set.

    I would very much prefer a Perl job, but there just aren’t that many places now that really want someone whose primary is Perl. Python’s fine, but it’s mostly tied up in LLMs and AI at the moment, neither of which I actually find very interesting. (They seem to be more “build a model and hope it does the work you wanted” as opposed to “write the code to do the job you want done”, which I find much more satisfying.)

    I haven’t done much with Rust yet, and I think it’d be a fun language to work in, but I don’t see a lot of demand for it yet. Scala is fun to write but I’d rather jump out a window than use the JVM toolchain again. (This also crosses off Java, Dart, and Flutter for me as well.) I have not in general found C++ to be that much fun, and Javascript is what it is. I can write it, I just don’t enjoy it.

    So it comes down to two languages, really: Go and Swift. I enjoy coding in both (for different reasons), but at the moment I’m seeing more Go jobs than Swift ones, though that may just be my settings. Certainly when you need a Mac/iOS programmer, you need someone who at least understands Swift.

    So I’ll probably end up trying it with both, and seeing how it goes. Onward!

  • Via Medium: A step-by-step intro to Go concurrency

    I recent wrote a blog post on the Zip tech blog about Go concurrency; it’s mostly an intro to how channels and select both work, and how to use them effectively.

  • Followup on Go dependency Jenga

    I was finally able to build a working version of the glide.yaml file for my project and convert it to dep. The items of note:

    • To get openzipkin to work, I needed to specify
      [[constraint]]
        branch = "master"
        name = "github.com/opentracing/opentracing-go"
      
      [[override]]
        name = "github.com/openzipkin/zipkin-go-opentracing"
      
      
    • To get logrus to work, I had to change it to github.com/sirupsen/logrus in all my code and specify
      [[constraint]]
        name = "github.com/sirupsen/logrus"
        version = "^1.0.5"
      
      
  • Desperate times, desperate measures: dealing with a Go dependency Jenga tower

    Desperate times, desperate measures: dealing with a Go dependency Jenga tower

    TL;DR

    If you absolutely have to manually update your glide.lock file to add a specific SHA1 for a dependency and can’t do it right with glide update, edit glide.lock as needed, then:

    go get mattfarina/glide-hash
    glide hash

    This gets the correct checksum for your glide.lock file; update the hash: line at the top. You can now glide install without warnings.

    The detailed explanation

    Our microservices have a number of dependencies, one of which is logrus. Logrus is a great logging package, but was the trigger of a lot of issues last year when the repository was renamed from github.com/Sirupsen/logrus to github.com/sirupsen/logrus.

    That one capitalization change caused havoc in the Go community. If you don’t understand why, let’s talk a little about dependency management in Go. (If you do, skip down to “The detailed fix”).

    Go doesn’t have an official dependency management mechanism; when you build Go code, you pretty much expect to compile all the code it will need at once. Go goes have a linker, but generally we really do just build a single static binary from source files, including the source of libraries too. The Go maintainers decided that it’s simpler to store one set of source code to be pulled in and complied rather than store compiled libraries for multiple architectures and figure out which one needs to be pulled in. The Go compiler is pretty fast, and maintaining multiple native binary versions of libraries is hard.

    Originally, all source management was done with go get, which would fetch code from a VCS endpoint and put it in the appropriate place in the GOPATH (essentially the location where “stuff related to but not part of this Go program” lives) so that it could be picked up during a compile. This is super simple, but fails in a number of ways: a set of go get commands are a set of commands, and have to be run before the program can be built. This may not be reproducible (if someone makes a new commit to the library, the HEAD changes). Telling go get to fetch a specific version of a library is harder to do. go get is great at pulling a specific isolated library, but not good at managing transitive dependencies: e.g., we’ve installed library foo, but it needs library bar to perform some functions, and bar needs baz to do some of its work. We’d really like to see all of these figured out and installed at once, and to not have to remember what all the dependencies are, or to have to have a script to run to fetch them. We’re potentially running on multiple architectures, and we don’t want to have to maintain multiple executable scripts just to fetch our dependencies.

    Go’s first cut at solving this was the vendor directory. This directory lives in the same tree as the Go source and can be committed to the VCS, so one could get the required sources into the vendor library, then commit the “known-good” version. This works for the versioning problem, mostly, but means that it’s easy for many slightly different versions of those libraries to end up spread across multiple source code repositories, and keeping them synced up for fixes is difficult, and it doesn’t address the transitive issues at all. To fix this, the Go community built unofficial source management tools to handle versioned access to the vendor directory plus automated detection and resolution of transitive dependencies.

    The problem is that because the Go community is large, inventive, and active, we have a lot of them. We’ve already used two different tools: Godep and, currently, glide, and are probably going to switch to dep, which looks to eventually be the standard dependency management tool blessed by the Go core team. [Update: wrong again. go mod is the current official winner.]

    glide (our current tool, as noted) manages dependencies with two files: glide.yaml, which describes enough of the direct dependencies and their versions that all of the dependencies and their own transitive dependencies can be figured out. The glide.lock file stores the results of this dependency resolution as specific VCS commits (SHA1 hashes in the case of Git), allowing us to quickly fetch exactly what we want when getting ready to compile the code.

    Like any other piece of software, the glide files have to be kept up to date, especially if there are dependencies on outside libraries (from Github and the like) by periodically doing a glide update to update dependencies in the glide.lock file that aren’t locked to a specific version (or range of versions) by glide.yaml. If one falls behind on this, or a change such as the Sirupsen/logrus to sirupsen/logrus one happens, or you simply need to upgrade something to a new version, these files can end up in a state where a glide install still works, because this simply downloads the revisions dictated by glide.lock without attempting dependency resolution again, but glide update doesn’t, because the glide.yaml didn’t limit the possibilities enough, and attempting resolution of the dependencies fails.

    To fix this, we can do it one of two ways:

    1. The right way, which entails plodding through all the revisions until we’ve found a new set that works, fixed the glide.yaml file so that it defines that new set, and then used glide updateto download them and rewrite glide.lock. This can be excruciatingly difficult, as it’s possible that the updated glide.yaml will no longer resolve, or will resolve the dependencies in ways that won’t actually build, and there will have to be many update/download/compile cycles to actually fix the issue.
    2. The wrong way, which is to muck around with glide.lock directly, adding or changing something without making sure that glide.yaml “compiles” to the updated glide.lock. This gets us back on track with code that builds and runs, but leaves us in the dangerous situation that glide update is now broken.

    The detailed fix

    If you näively go the wrong way and just make changes to the glide.lock file, glide tries to be a good citizen and warn you that you’ve done something you ought not to:

    [WARN] Lock file may be out of date. Hash check of YAML failed. You may need to run 'update'

    appears when you glide install.

    As noted, the problem is that if you run glide update, you’ll break everything because you didn’t fix glide.yaml first. And maybe you just don’t have time to find the right incantation to get glide.yaml fixed just now.

    So, you lie to glide, as follows.

    1. Add the dependency to glide.yaml.
      • Edit glide.yaml and add the dependency plus its version if it has one. (Use master if you want to track HEAD or a specific SHA1 if you want to pin it to that commit.)

        - package: github.com/jszwec/csvutil
        version: 1.0.0
    2. Add the dependency to glide.lock.
      • This one must be the SHA1; the easiest way to get this is to go to the repository where it lives and copy it down. I won’t go into detail here, but however it works in your VCS, you’ll need the full SHA1 or revision marker.

        - name: github.com/jszwec/csvutil
        version: a9cea83f97294039c58703c4fe1937e57ea5eefc
    3. If we stopped at this point, we’d get a warning from glide install that would recommend that we use glide update instead to install the required libraries. In our case, with a delicate web of dependencies between local libraries and Echo, openzipkin and Apache Thrift, and the two different versions of logrus, a glide update breaks one or more of these dependencies when we try it. To prevent someone else from spending way too much time trying to resolve the problem by juggling versions in the glide.yaml in the hope of creating a stable glide.lock, we need to fix the computed file hash at the top of the glide.lock file so that the warning is suppressed.

      This is a hack! The best option is probably to import all the SHA1’s into the glide.yaml file as versions, ensure glide update works, and then gradually relax the constraints until glide update fails again, then back up one step.

      To calculate the hash, we can go get mattfarina/glide-hash, which creates a new glide hash subcommand that does exactly that and prints it on the console.

      We install the subcommand plugin as noted, then cd to the codebase where we need to fix the glide.lockfile. Once there, we simply issue glide hash, and the command prints the hash we need. Copy that, edit glide.lock, and replace the old hash on the first line with this new one.

      Warning!

      This is absolutely a stopgap solution. Sooner or later you’re going to need to update one or more of the libraries involved, and you really will want to do a glide update. Yes, you could keep updating this way, but it would be a lot better to solve the problem properly: go through all the dependencies, update the ones you need, and then make the necessary fixes so that your code and the library code are compatible again.