• 3 Posts
  • 64 Comments
Joined 1 year ago
Cake day: July 8th, 2023




  • Despite my love of YAML, I actually think he has a small point about unquoted strings. I teach students and see their struggles. Bash also has unquoted strings, and basically all students go years and years without realizing how these differ:

    cat --help
    cat "--help"
    # ^ same thing
    
    cat *
    cat "*"
    # ^ not same thing
    
    cat $thing
    cat "$thing"
    # ^ similar but not the same 
    

    To know the difference between special and normal-but-no-quotes you have to know literally every special symbol. And, for example, it’s rare to realize that the -- in --help isn’t special at a language level; it’s only special at a convention level.
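
    A minimal sketch of those three cases, assuming a POSIX-style shell. `show_args` is a hypothetical helper (not part of any tool) that prints each argument it receives in brackets, and the scratch directory exists only so the glob has something to match:

    ```shell
    # Hypothetical helper: print every received argument wrapped in brackets.
    show_args() { printf '[%s]' "$@"; echo; }

    # Scratch directory with two files so that * has something to expand to.
    demo_dir=$(mktemp -d)
    cd "$demo_dir" || exit 1
    touch a.txt b.txt

    # Quotes around a plain word change nothing: the command sees the same argument.
    plain_unquoted=$(show_args --help)
    plain_quoted=$(show_args "--help")

    # An unquoted * is expanded by the shell before the command ever runs.
    glob_unquoted=$(show_args *)
    glob_quoted=$(show_args "*")

    # An unquoted variable is split on whitespace; a quoted one stays whole.
    thing="my file.txt"
    var_unquoted=$(show_args $thing)
    var_quoted=$(show_args "$thing")

    echo "$plain_unquoted $plain_quoted"
    echo "$glob_unquoted $glob_quoted"
    echo "$var_unquoted $var_quoted"
    ```

    The first pair prints identically, while the other two pairs differ in how many arguments arrive, which is the part students tend not to see when they only look at the command line.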

    Same thing can happen in yaml files, but actually a little worse I’d say. In bash all the “special” things are at least symbols. But in yaml there are more special cases. Imagine editing this kind of a list:

    js_keywords:
    - if
    - else
    - while
    - break
    - continue
    - import
    - from
    - default
    - class
    - const
    - var
    - let
    - new
    - async
    - function
    - undefined
    - null
    - true
    - false
    - NaN
    - Infinity
    

    Three of those are not strings. Syntax highlighting can help (which is why I don’t think it’s a real issue). But still: “why are three not strings? Well … just because”. AKA there isn’t a syntax pattern, there’s just a hardcoded list of names that need to be memorized. What is actually challenging is that, unless students start with a proper YAML tutorial or see examples of quotes in the config, it’s not obvious that quotes will solve the problem (students think "true" behaves like "\"true\""). So even when they see true is highlighted funny, they don’t really know what to do about it. I’ve seen some try stuff like \true.
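
    To make the “hardcoded list” concrete, here is a rough sketch, assuming the YAML 1.2 core schema. `yaml_plain_type` is a hypothetical helper, not a real parser: it only models the special names for unquoted scalars and ignores numbers. Parsers that follow YAML 1.1 rules recognize extra names such as yes/no/on/off, which this sketch leaves out.

    ```shell
    # Hypothetical helper: guess how an UNQUOTED YAML scalar resolves under the
    # YAML 1.2 core schema. Only the special names are modeled, not numbers.
    yaml_plain_type() {
      case "$1" in
        null|Null|NULL|'~')                         echo null ;;
        true|True|TRUE|false|False|FALSE)           echo bool ;;
        .nan|.NaN|.NAN)                             echo float ;;
        .inf|.Inf|.INF|[-+].inf|[-+].Inf|[-+].INF)  echo float ;;
        *)                                          echo string ;;
      esac
    }

    js_keywords="if else while break continue import from default class const var let new async function undefined null true false NaN Infinity"

    # Collect every list entry that would NOT load as a string.
    non_strings=""
    for word in $js_keywords; do   # intentionally unquoted so the list splits into words
      if [ "$(yaml_plain_type "$word")" != "string" ]; then
        non_strings="${non_strings:+$non_strings }$word"
      fi
    done

    echo "not strings: $non_strings"
    ```

    Under these assumptions only null, true, and false are flagged; NaN and Infinity stay strings because the float spellings are .nan and .inf. Writing the entry in quotes, as in - "true", is what turns it back into a string.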

    Still doesn’t mean yaml is bad, every language has edge cases.


  • It’s easy for me to say “just start writing JSON in the yaml. It doesn’t get more simple than JSON”, but actually I do think there’s a small point with the unquoted strings.

    Back before I knew programming, I was trying to change grammar settings in Sublime Text 2, which uses YAML. I had no idea what YAML was. The default setting values used unquoted strings for regex. I knew PCRE regex and escapes, but suddenly they didn’t work, and when I tried to match a single quote inside of a regex, that also didn’t work. I didn’t know I was editing a YAML file (it had a .tmLanguage extension). Even worse, if I remember correctly, unparsable settings just silently failed. Not only did I have no errors to google, I didn’t have any reason to believe the escapes were the cause of the problem (they worked on the command line). Sometimes I edited the regex and it was fine, and other times it just seemed to break. I didn’t learn about quoting in YAML until years later.

    For me that was an unfortunate combination, which was exacerbated by YAML’s unquoted weirdness. But when you’re talking about “did you read the spec”, that’s a whole other story. .nan for nan, tabs vs spaces, unquoted string weirdness, etc. should just be one error message + google away. I think they’re small hiccups in what is overall a great format.



  • I have read the 1.2 spec (I’m trying to make a round-trip parser for JS, and I do maintenance on a fork of the ruamel.yaml Python package). I actually think it’s very well thought out, with things I hadn’t considered like future extensibility, streaming applications, and data-corruption detection.

    The diagrams, color coding, and lesser formality of the spec were much appreciated, especially compared to something like the ECMAScript spec, which reads like a math textbook had a child with a legal document.

    I’m not saying YAML is perfect; round trip (the thing I’m working on) is nearly impossible because it wasn’t a design goal. It has a few too many features (I’ve never seen a declaration in the wild), but it does a good job at accomplishing the creators’ goals, and the additional features basically only slow down parser implementers like me. I often pick it because of the tag support, which I’ve struggled to find an equivalent for in other serialization languages. I use anchors in recursive data structures, and complex keys for serializing complex data structures (not human readable). The “document end” marker has been nice when I’m worried about detecting partial writes. And the merge key is nice for config files.

    The application/perspective matters. YAML might be bad for you, but it’s not bad for everyone.





  • It gets worse :/

    I looked up the brand (Invenda). Their PDF includes “using AI”, “measuring foot traffic”, and gathering “gender/age/etc”, i.e. facial recognition to estimate a person’s age and gender.

    And in terms of “stored locally”, this is straight from their website:

    The machine comes with a “brain” – Invenda OS – and is connected to the Invenda Cloud, which allows you to manage it remotely and gather valuable environmental, consumer and transactional data. The device can be branded according to your requirements to further enhance your brand presence.

    The marketing is also so fricken backwards that it reads like satire:

    For a consumer, there’s no greater comfort than shopping pressure-free. Invenda Wallet allows consumers to browse, select and pay for products leisurely and privately 🤦‍♂️





  • I work in the field of AI.

    “Humans are excellent at ignoring” is something I like to tell students, because it’s computationally impossible for any intelligence (human or AI) to remember and process 24k-resolution-esque information every millisecond. Data must be thrown away, and humans are actually exceptional at it.

    If AI could ignore the correct things, we would already have AGI.

    Also search “invisible gorilla” on YouTube if you haven’t already heard of the phrase.



  • You’ll basically have timezones either way; there are just two ways of doing it.

    If we all used UTC, then businesses would need to change what time they opened depending on their location. Ex: Best Buy opening at 12 noon on the US east coast, and 3pm on the west coast (the west coast is three hours behind, so the same local morning lands later on the UTC clock). Locations in between would have different opening times. So we would get the noon zone, 1pm zone, 2pm zone, and 3pm zone. All nationwide businesses with standard open/close times would effectively follow the same pattern, and it would be best if they all coordinated on where those zones occurred. So then we would get new timezones; they’d just be slightly different in how they functioned.
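
    A quick back-of-the-envelope sketch of that arithmetic, assuming a store that opens at 10am local time and the standard-time UTC offsets of the four contiguous US zones (no daylight saving). Both the 10am figure and the helper name `utc_open_hour` are made up for illustration:

    ```shell
    # Hypothetical helper: convert a local opening hour (0-23) and a UTC offset
    # in whole hours into the hour shown on a UTC clock.
    utc_open_hour() { echo $(( ($1 - $2 + 24) % 24 )); }

    local_open=10   # assumed local opening hour, for illustration only

    # Zone name and standard-time UTC offset, separated by a colon.
    for zone in "Eastern:-5" "Central:-6" "Mountain:-7" "Pacific:-8"; do
      name=${zone%%:*}
      offset=${zone##*:}
      echo "$name opens at $(utc_open_hour "$local_open" "$offset"):00 UTC"
    done
    ```

    The four results are one hour apart, so the “opening-hour zones” end up drawn along the same lines as today’s timezones, just expressed as different numbers on a shared clock.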