I observed that people who learned Python without type hints are fine gradually adding them. I initially learned Python in the 2.4 days circa 2005 but then didn’t use it for ~15 years, during which I learned many other languages. Notably, I spent eight years doing Scala and fell back in love with types. I’ve been back in the Python world for almost four years. I’ve not written a class or method without a type hint. I need them and TBH I struggle a bit to edit teammates’ code without them, so I end up adding them as I go.
I learned Python in the early 2000s, used it professionally 2012–2014 (2.x) and at another company 2014–2016 (where I insisted on 3.x because Unicode, man, how do we live like this?!). Once I started dipping back into it last year for fun (my preferred HDL library, Amaranth, is written in it), I decided the era of types was definitely nigh and resolved to annotate as much as possible.
I really dislike Python’s type hints, and I am otherwise very much into types. (I’ve used Scala in anger plenty too!) Several times I’d give up because of their clunkiness and the noise they’d add to my code (and yet the imprecision I was still faced with), and then months later I’d try again. And again. Finally I banished them from my projects.
If I were writing Python on a team, then I’d probably want to agree on some degree of type hinting as a rule, but the reality of using Python type hints, for me at least, is inevitable frustration.
(To make matters worse, Amaranth uses PEP 526 variable annotations for things other than type declarations, so you need to shush your type checker up whenever you use that feature of it. But even if this weren’t the case, the total unwieldiness is still far too great.)
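Why do you dislike them? You didn’t enumerate a single reason.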
the noise they’d add to my code (and yet the imprecision I’m still faced with)
PEP 526 variable annotations for things other than type declarations, so you need to shush your type checker up
I think I enumerated at least a couple? Maybe not with huge precision, but mostly because I didn’t feel like going back to old projects’ commit history to find all the issues I was facing, just to add factual weight to an anecdote that didn’t really rely on “factual weight” for its point.
This was a typical [tool.pyright] section in my pyproject.toml:
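(The original section is elided here; what follows is an illustrative sketch using real pyright diagnostic names, not the exact list:)

[tool.pyright]
# Untyped imports otherwise light up every "unknown type" diagnostic.
reportMissingTypeStubs = false
reportUnknownMemberType = false
reportUnknownVariableType = false
reportUnknownArgumentType = false
reportUnknownParameterType = false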
Look stupid? It is stupid! I’m importing things from an untyped library, so it’s either this or loading up every imports section with # pyright: ignore[blah]. I started with zero of these and wanted to keep it that way, but nope; not to mention # pyright: ignore[reportPrivateUsage] when tests reach into things. I still need way more casts than is ergonomic. The actual type declarations would frequently give me pain: some generators are just impossible to type correctly (or at least in a way that doesn’t lose precision or add runtime overhead differentiating cases purely for the benefit of the types). An untyped library returning a context manager in a way that the type checker doesn’t understand leads to even more ignore comments littering my codebase. Generics with typing.TypeVar would often produce vexing results when combined with things like generators. You need to fall back to Protocol to express a callable with variadic or keyword arguments.
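(For example, typing.Callable can only describe a positional parameter list, so a callable taking keyword or variadic arguments gets expressed as a Protocol with a __call__ method. A minimal sketch; names invented:)

from typing import Protocol

class Logger(Protocol):
    # Callable[[str], None] cannot express *values or the keyword-only
    # "level" parameter; a Protocol's __call__ can.
    def __call__(self, msg: str, *values: object, level: int = 0) -> None: ...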
It looks like there are some nice tools added in recent versions that make some parts of this less painful (Concatenate, ParamSpec), but it’s not pleasant. The value proposition does not add up for my use cases.
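This feels like a “Cute. Now don’t do that,” sort of thing.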
Destructuring has its place. I don’t care much for the usage of itertools.count() in that example (seems ripe for an error related to failing to break while paging, but it’s fine), whereas the usages of enumerate and zip are normal to me.
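(To illustrate the paging worry: a loop of roughly that shape, with fetch_page as an invented stand-in for the API call, is one forgotten break away from spinning forever:)

from itertools import count

def fetch_all(fetch_page):
    # fetch_page(n) is assumed to return a list, empty once pages run out.
    for page_num in count(1):      # unbounded page counter
        items = fetch_page(page_num)
        if not items:
            break                  # forget this and the loop never ends
        for index, item in enumerate(items):
            yield page_num, index, item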
You wouldn’t want someone to read this and try to shoehorn it into their next PR, but it’s good to know it’s possible for those niche cases where it’s actually the best way to write the code.
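Help others figure out what to use when they’re working in a new language in a way that’s not AI.

https://codethesaur.us/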
Switching my home DNS blocking from PiHole on DietPi on a Raspberry Pi 4 to PiHole on DietPi on Proxmox on an Acemagic Intel N95 SBC I picked up, because PiHole’s reporting (which I use more frequently than I would like) seems to run poorly on SD-card storage, and the SBC has an M.2 NVMe drive. This new unit also has two Ethernet ports, so I’m experimenting with a management interface and one port dedicated to the VM on which PiHole is running. I’ve also never used Proxmox before, so this is a good reason to check it out before swapping over. I’m also likely to play around with keepalived in the process so that I can keep the RPi4 in service as a fallback for when the Acemagic box needs a reboot.
Running more Ethernet in my new house. I’ve run around 565 feet of Cat6 UTP and Cat6A STP of the ~967 feet I’ve estimated. That leaves about 402 feet remaining spread over 7 runs. Unfortunately, the one I’m doing is only about 6 feet: it’s going in the same room as my rack. The others require a path taking them up from the basement to the attic through walls that run parallel: one is 10 feet from the exterior, one is 11 feet. The only potential crosspoint is a closet wall and I have no force powers to bullseye that womp rat, only an oscillating multitool and math. I’d welcome feedback on how best to accomplish this cheaply with minimal wall opening.
I’m moving my Home Assistant setup to the new house, too. I barely use it at my old house, but the new house has much more smart home stuff that is ripe for remote control.
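Also looking into the mini PC options for Proxmox; how are you finding the Acemagic box?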
It’s working out okay so far. Power draw is as advertised. I only really need it to run one or two small VMs at most. Disk also seems fast enough, at least faster than a Raspberry Pi SD card! Obviously not hard to beat that. The form factor, with the HDMI ports coming out of the side opposite the USB ports, was annoying for setup, but once racked, the power and Ethernet ports are on the same side, so it’ll be fine. I didn’t try the built-in iPXE booter; instead I used a flash drive with the netboot.xyz booter on it. I tried a couple of different things on it and they all ran decently enough, including Ubuntu GNOME.
Having 12 GB of RAM is pretty nice for VMs. I project that I would be able to run three 3 GB VMs with two cores each and reserve the remainder for the host. I think as long as the services in the VMs aren’t being hammered concurrently, it’ll all work out. But I’m only really planning to put PiHole on it!
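If you’re using Proxmox, why not use LXCs? I’ve found them lighter in memory use on my Proxmox setup.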
TBH, I couldn’t get them to work quickly. Some dropdown with templates wouldn’t populate. I quickly moved to VMs and encountered a similar problem and found some commands I had to run to provision VM templates. I didn’t return to the LXC section although it seems I should!
To me, the SQL version is a step backward. I’ve done a few projects with Polars, including one with about 2k SLOC of Polars + Plotly with a hint of pandas, because Plotly didn’t support Polars at the time (2022). That Polars code, while seemingly cumbersome for someone unfamiliar with the API, is well written. It’s composable and testable. pl.col("something") calls produce objects that can sit in variables or be built in a function. One example I vaguely remember from that 2022 project was a filter we put behind a method, so that the final selection looked something like the sketch below.
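(An invented reconstruction of the shape, not the original code; column and function names are illustrative:)

import polars as pl

def well_paid(min_salary: int) -> pl.Expr:
    # An expression built by a plain function; it composes like any value.
    return pl.col("salary") >= min_salary

df = pl.DataFrame({"salary": [90_000, 150_000], "years_of_experience": [3, 10]})
final = df.filter(well_paid(100_000)).select(pl.col("salary"), pl.col("years_of_experience"))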
We’d also put all of the columns we were working with into a class like
class Columns:
    salary = pl.col('salary')
    years_of_experience = pl.col('years_of_experience')
so we got nice completions and renames and usage tracking in our IDEs. You can’t get that (easily at least) with a SQL string, nor can you get the easy composability of functions and variables that Polars affords.
In the end, you gotta use what makes you productive. Polars’ API took some learning but it ended up being the right thing for our team working independently, asynchronously, and with varying skill levels.
I can’t agree more with everything you said, and you said it so well :)
The main benefit of Polars that I’ve seen is composability and reusability, something one can’t get from SQL. The API is so well thought out. I worked for several years with pandas, and some things were just hard to get right on the first try; with Polars it just flows naturally.
Out of curiosity, how did you learn Polars? You’ve used it more than I have, probably by a couple orders of magnitude. I can muddle my way through writing some Polars code, but it doesn’t quite feel intuitive yet.
SQL reminds me of Latin or English: it’s old and has plenty of warts, but it’s the common language, and that’s huge. Actually, human languages are a good analogy. Why is there a * in COUNT(*)? Is it valid in other aggregate functions? I have no clue, in the same way I barely know what a gerund or infinitive is. But I’ve typed the characters COUNT(*) at every place I’ve worked over the last decade, and I’ve been using verbs for even longer. I don’t truly understand what I’m writing. But I can definitely tell you how many rows are in that table, or what the average salary is, or any other question you’d like to ask about the data.
Contrast with Polars. It’s clear even to me that pl.col("salary") must be a Python object. I can say that confidently, without worrying that I’m conjugating it incorrectly, or that it’s a prepositional phrase masquerading as a Python object. No! It’s a Python object, full stop. The upside is I can build helper functions and classes, and everything works as I would expect. The downside (for me) is the speed/fluency penalty of not having all the idioms for manipulating Polars data.
I learned Polars the old-fashioned way: editor, terminal, and docs open on the screen. Read the API docs for a method, use it, play around a bit, write a test, make code pass the test, and move to the next step. Every now and then, I’d poke around in some discussion groups. If I was really stuck, I’d find out how to do something in pandas, and then I’d be more likely to find the terminology in the Polars docs or in some examples.
My constant problem with SQL is that I want to know types a lot of the time and I want composability. SQL doesn’t (easily) let you care about types and it’s not natively composable.
Probably the hardest part of moving from a SQL mindset to a DataFrame mindset is really understanding the tools in the toolbox. You have to play with them, see what they can do, and maybe get a little hurt misusing them sometime in order to learn proper use and application.
I’ve had the (mis)fortune of having very little SQL in my career. I’ve always worked in document databases, pure CRUD APIs where the SQL is highly abstracted away, or DataFrame-based systems like Spark, Polars, and pandas. SQL of more than about 20 lines, properly formatted, makes my eyes bleed. I loved Quill when I was building a webapp in Scala, because I could write DataFrame-like Scala code and it would generate the SQL at macro time and show you the effective SQL in your IDE.
It depends somewhat on the database. Standard SQL and Postgres are fairly strongly statically typed. MySQL and SQLite less so. The problem from a software engineering point of view is that the type declarations are in the table definitions, which are often not close at hand.
Indeed, thank you for that specificity. That’s precisely what I mean.
In my particular work area, it takes extra steps for me to go look at the types of a table whenever I’m accessing Hive through Spark SQL. Sometimes I don’t control the table definitions; sometimes the tables are created in a completely different process that I do control, but it’s not part of the same code base. That’s an inherited architecture problem, an elephant that I’m slowly eating.
As mentioned in brief on that thread, happily there’s already an established fork on F-Droid called Syncthing-Fork. The maintainer has been careful to manage expectations about ongoing development when asked but it looks like a little extra contribution is trickling in. Perhaps this one will keep ticking along.
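Yes, unfortunately it does seem to be F-Droid only – not available on the Play Store.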
Right, that’s a large part of the problem. The author of the now-discontinued app fought since February to get Google Play to accept new updates and failed. The latest update on that issue:
Nothing came of the discussions with google. Demands by Google for changes to get the permission granted were vague, which makes it both arduous to figure out how to address them and very unclear if whatever I do will actually lead to success. Then more unrelated work to stay on play came up (dev. verification, target API level), which among other influences finally made me realize I don’t have the motivation or time anymore to play this game.
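It is past time people start complaining about Google and the Play Store, not the apps and their creators/maintainers.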
/r/androiddev is full of warnings ranging from kvetching to “Google wrecked my six-figure business” (with its own spectrum of rightly and wrongly, depending on the reader’s and the public’s point of view).
I’ve been an Android user for 16 years. I bought into Android with the T-Mobile G1 / HTC Dream. I wanted to build an app, but by the time I had the time in the early 2010s, it was clear to me that Google’s runaround was impeding people. By the time I got into a place in life where I could choose “stay a salaryman” versus “start a VC-backed app,” the warning flags were all over /r/androiddev and other communities on reddit and elsewhere. It’s probably not as bad as the worst cases make it out to be, but I’ve had enough friends try app companies and complain about Google and Apple store policies making it challenging if not impossible to do business.
One case was a local tour app. Apple refused their app for nine months. I used it. It was fine. Lawyers got involved, but they basically said: “If you go this route, Apple will find every little reason to reject your every update if you win, and they’ll probably just ban you for life if you lose.”
Personally, I’d see the archiving as more questionable (but still ok in most cases) than showing a logged-in user the full post, which to me falls under the reader app being an unusually-shaped “user agent”.
Yeah, archiving feels icky to me, too. I appreciate the value of the Internet Archive when content disappears from the web for reasons other than the author’s intent. The reason I’m developing this feature is that I’ve often found cool guides, blog posts, tutorials, etc. simply disappear from the web, and it felt like a “book burning” event had happened. But I am not sure I should archive, really.
Why don’t you let your user do the archiving? I find a blog post I don’t wanna lose. I save the page locally. So now, instead of the browser, the saving is done by your reader app.
Plus you get to dedupe the stuff for many users. That’s kinda better since now you’re letting your app be more like a browser.
Do you mean literally provide a button, “archive this for me”? Hmm, interesting idea. Similar to what Pinboard does (or did?). Thanks, I’ll think about this!
I subscribe to Pinboard almost entirely because of the archival feature. If I like something enough to pin it, I want to have a copy of it forever in the highest-fidelity representation at a reasonable cost (reasonable: print to PDF; unreasonable: a dedicated computer that teleports to me and, upon touching a button, powers on and displays that URL and only that URL).
Fight for the user. Do things that save the user time and energy without subtracting from the counterparty’s null experience based on user expectations. That is, save a click but don’t hammer an API.
Stringing wire fence for my dog yard, using a come-along I hand built.
For the longest time, I believed that math was reality. This home renovation has been a stark reminder that math simply describes reality. Reality simply is, and no matter how good your math is, it’s still just a description and not real. I feel that pretty hard whenever I do cuts and holes based on math and then reality doesn’t line up with my math! And I say this with years of math and years of construction under my belt.
The sad part is that most energy is consumed in the switching hardware between the server and client. There are huge shadownets spanning cities (with Wi-Fi and directional antennas) in Cuba, and even larger ad-hoc networks in Ukraine where even the switching is provided by the participants. This would be, in my mind, a much more consequential approach: Can you sustain a network, e.g. across a city, just with solar-powered components?
The sad part is that most energy is consumed in the switching hardware between the server and client
I wonder if this is true.
Based on eyeballing the allocation of space and power in homes and data centres, and the amount of kit needed for metropolitan-area nets, networking uses a lot less power than computing.
Some searchengineering suggests a typical cellular base station covers about 1000 users and uses about 1 kW, and base stations account for about 50% of a cellular network’s power usage, so that’s about 2 W per user (1 kW ÷ 1000 users gives 1 W per user, doubled because base stations are only half the total). That is relatively large compared to the phone and the user’s share of the server(s).
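So, hmm, dunno.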
There was a nonprofit in my city called Metamesh that built a community wifi network like this, through a combination of grants and individual volunteers running their own nodes. It was pretty cool! Unfortunately it seems defunct now.
Meta Mesh founding president here; no longer on the board, but still active where I can be in promoting it. Pittmesh is essentially no more, as we couldn’t really find funding to maintain it or build it out further. The nonprofit organization pivoted to being a (Wi-Fi) internet service provider under the name Community Internet Solutions. It currently serves almost 150 customers but is going through some funding problems and could really use donations while it awaits grant funding that is anticipated but not yet delivered.
Oh man, that’s too bad! I am always interested in mesh networking. There has been a resurgence of interest recently with LoRa-based Meshtastic, although that is much more bandwidth limited and will likely never serve websites. I’m building a few solar nodes this weekend to try to fill in coverage in our local mesh.
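seriously? Why the surprise?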
I don’t read much surprise. I like write-ups like this. They’re good for the two use cases that include just about everyone:
Eternal September; today’s 10,000 just learning something.
“People need to be reminded more than they need to be taught.” - Vonnegut.
I was fortunate to have a nice lesson in this while I was in college, which I then exploited for years. My undergrad school didn’t “allow” Macs, primarily because the school IT didn’t have any budget for an expert and none of the school’s security software or remote access stuff had Mac OS X versions circa 2004–2007. The onboarding process was a CD-based installer with hundreds of MB of cruft on it (antivirus, anti-malware, remote drive mounters, an offline copy of the troubleshooting guide that IT would walk through when you called for help, etc.), and I knew a workaround: at the end of onboarding, the installer made a single curl-able request to a web service. That web request had a MAC address in the parameters. Nothing else in it mattered; none of it was ever checked. That MAC address was added to a list that got an “authenticated” IP address from DHCP, while anything not in the list got an address in a different VLAN. I got probably two dozen Macs per year, and a few Linux people, connected with my magical command line skills and knowledge of the API.
…I maintained the installer software during that time, under the direction of one of the heads of IT, so I guess you could say it was insider knowledge! But somebody else wrote the API 😉
In my experience, TDD is a great way to build up code in a way that limits unnecessary things, while also enabling you to remove things once thought necessary, in pursuit of a reasonable test coverage goal.
YAGNI is good. One thing about YAGNI that I call out is folks’ tendency to prefer 1:1 to 1:M/M:1 in their designs. Just this week, a value once thought only ever to be binary became trinary, with a proposed fourth case. Of course, we’re working in Python, so going from a boolean to an Enum is pretty easy for the binary case: fix the type hints while the previously hardcoded true/false becomes
from enum import Enum

class ThingWanted(Enum):
    Yep = True
    Nope = False
Then we incrementally add the third Enum value (Maybe?) wherever it’s needed. A few weeks ago, my team incurred a week-long degradation and partial outage of a key service because someone decided to go from a 1:1 relationship to a 1:M relationship, and too many things outside of the tested area couldn’t handle the switch from a single thing to a list of things. A junior and a mid-career dev learned why we don’t change the type of an existing member in a live distributed system! Not using a collection from the start saved 2 bytes per usage, or about 20 bytes per request, something like 20 MiB per day given the request volume. With 20/20 hindsight, I’d say the ~14.25 GiB that would have “wasted” over ~2 years would have been a worthwhile expense to avoid this downtime, which had probably $10,000 per hour worth of people on sub-daily status calls until it was fixed. But that’s another YAGNI problem ;-)
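Getting a “bandwidth restricted” error: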
The page you have tried to access is not available because the owner of the file you are trying to access has exceeded our short term bandwidth limits. Please try again shortly.
Details:
451 Actioning this file would cause “jbkempf.com//blog/2024/ffmpeg-7.1.0/” to exceed the per-day file actions limit of 160000 actions, try again later
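https://web.archive.org/web/20240930110551/https://jbkempf.com/blog/2024/ffmpeg-7.1.0/ works

This is true and useful, but also note the “archive” options by the title, before the comment count.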
Thanks for that call out. I’ve always thought that link was an action that would hide a story from view in a different way than the hide link in the same bar.
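Oh, good point. Do you think “Mirrors” or “Caches” would be more clear?

Yes, either would be an improvement. Slight preference for caches over mirrors in this context.

I think even “Archives” would be better than “Archive,” because the former won’t be mistaken for an imperative verb.

I opened a meta discussion about the archive link.

Amazing and terrifying.

I’m scared by how much I like this.

Magnificent

I always enjoy reading these!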
This is a pretty good article on how to be gregarious in interviews, if what the company is looking for is a high level of gregariousness. Many people like to work with sociable coworkers, so this advice is great in that category.
The efficacy of the recommendations is reduced by the relation to Canonical. I’ve never heard a positive opinion of that company’s hiring process, whether from people who succeeded or people who failed, and inclusive of present and former employees. At a recent tech event where names were named of companies doing good and bad in hiring, Canonical was the only company brought up more often than a local banking conglomerate as having a confusing and frustrating hiring process. I’m consistently frustrated by how often Canonical comes up as an example of convoluted interviews, especially for a company I have otherwise loved and supported — even financially — since its inception.
I’m consistently frustrated by how often Canonical comes up as an example of convoluted interviews, especially for a company I have otherwise loved and supported — even financially — since its inception.
Indeed; this is a bit of a tangent, but I feel similarly about Canonical in general. I was super into them when they hit the ground in 2004, was the first LoCo team contact in my country in 2006, handed out disks on Software Freedom Day, and hoped my first proper programming job might be with them. (I was in high school at the time.)
It didn’t take long for my idealism to peter out. Meanwhile, a friend from high school (with whom I shared that enthusiasm!) got a job with them in 2010 or so and has been working for them ever since. I’ve never heard from him in an “open source context” or seen any of his work in one (brushing shoulders on a mailing list or in a pull request or whatever), because AFAICT in the 15 or so years since, he’s mainly worked on Launchpad and the Snap store, which themselves are so… Canonical in their design and ethos. It makes me kind of sad.
Something I’ve found useful is using the Proxy-Auto-Detect directive in combination with a script that determines if my machine is on a network I control that has a proxy configured.
My example is here. When making requests, Apt runs that script, which uses Netcat with a short timeout to see if http://proxy:3128 resolves and connects.
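(A minimal sketch of such a detector in Python rather than shell, assuming the proxy lives at proxy:3128 and that apt’s Proxy-Auto-Detect contract is to read either a proxy URL or the word DIRECT from the command’s stdout; the apt.conf path in the comment is invented:)

#!/usr/bin/env python3
# Wired up in apt.conf with something like:
#   Acquire::http::Proxy-Auto-Detect "/usr/local/bin/detect-proxy";
import socket

PROXY_HOST, PROXY_PORT = "proxy", 3128  # assumption: matches the network's proxy

try:
    # Cheap reachability probe with a short timeout, like the Netcat check.
    socket.create_connection((PROXY_HOST, PROXY_PORT), timeout=0.25).close()
    print(f"http://{PROXY_HOST}:{PROXY_PORT}/")
except OSError:
    print("DIRECT")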
I’m genuinely disappointed by this news. AnandTech was one of the sites I spent a lot of time on in the 2000s. Its method informed how I wrote as a tech journalist on the side for several years. Not many of the OGs remain.
I wrote for ThinkComputers.org 2006–2014 (general tech reviews) and cofounded and ran BIOSLEVEL.com 2007–2010 (Linux-focused tech reviews; my cofounder and I had a falling out and I pulled all of my content, but I still have one acknowledgement in an article from 2008). There are so many outlets in the space now, and I never really know whom to trust among the newer ones until someone from one of the OGs I know vouches for them.
Astral is also building Rye. When should one use Rye versus uv?
I’ve been a Poetry devotee for a while, but the promise of being able to handle Python installation is incredibly attractive and drastically reduces the moving parts of my typical setup.
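I found this: https://github.com/astral-sh/rye/discussions/1164

The impression I got is that uv is effectively replacing Rye.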
When I worked for a data analytics company, I once pitched to my management the idea that we’d charge for a separate “CSV plugin” or “CSV processing” because we found that
problems with malformed CSV ingest accounted for about 75% of the support cases that made it to the developers, and
processing well-formed CSV, in general, took something like 35% longer than Avro, the format to which we translated CSV immediately after ingestion. Parsing CSV and maintaining logic for its exceptions carried a lot of development and operations overhead!
I don’t know if they ever implemented the idea. The organization was dysfunctional and still trying to become a product organization from its roots in consulting services, a.k.a. doing whatever made the customer happy, including suffering mistakes from manual CSV output.