Category Archives: spam

new regex stuff!

logical operators! thanks ian! 😉

+ () [] - |

(stuff that remains the same)+(stuff that changes) – otherwise known as “capture groups”

[89] = 8 or 9

[0-4] = 0, 1, 2, 3, or 4

| = logical OR

so…

\D(85\.157\.47\.)+(12[89]|1[3-9][0-9]|2[0-4][0-9]|25[0-5])\D

means “capture everything in 85.157.47.128/25”

which, up until now, has meant “make a separate rule for every IP address between 85.157.47.128 and 85.157.47.255” — 128 SEPARATE RULES, which takes A LONG time, and slows down processing speed.
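a quick way to double-check a pattern like this is to run it against every possible last octet. the sketch below does that in python; the `Received:` line is a made-up header rather than real mail, and python's `re` module may not behave exactly like the regex engine inside a given mail filter. one thing worth knowing: in regex, `+` after a group means "one or more of that group" rather than "and then", so the pattern should match the same addresses with or without it.

```python
import re

# the pattern from the post, unchanged
PATTERN = re.compile(
    r"\D(85\.157\.47\.)+(12[89]|1[3-9][0-9]|2[0-4][0-9]|25[0-5])\D"
)

def last_octets_matched():
    """Return every last octet (0-255) that the pattern accepts."""
    hits = []
    for n in range(256):
        # hypothetical header line, only here to give the regex some context
        line = f"Received: from [85.157.47.{n}] by mx"
        if PATTERN.search(line):
            hits.append(n)
    return hits

matched = last_octets_matched()
print(matched[0], matched[-1], len(matched))
```

if the first and last values printed are 128 and 255, and the count is 128, the one regex covers the same ground as the 128 separate rules.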

this is a BIG step forward!

WOO!!! 😎👍

ETA 200205: even more WOO!!! because ian directed me to a RegEx Numeric Range Generator, which means that i don’t have to figure them all out myself! WOO!!! 😎👍
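for the curious, here is a rough sketch of how a numeric range generator can work. it is NOT the tool ian pointed to (no idea how that one is written); it is one common approach, which splits the range at "round" numbers (…9, …99, and so on) so that each piece can be written with simple character classes. all of the function names are made up for this example, and it assumes 0 <= lo <= hi.

```python
import re

def fill_by_nines(n, count):
    # replace the last `count` digits of n with 9s: (128, 1) -> 129
    return int(str(n)[:-count] + "9" * count)

def fill_by_zeros(n, count):
    # replace the last `count` digits of n with 0s: (256, 1) -> 250
    return n - n % 10 ** count

def split_points(lo, hi):
    """Upper ends of the sub-ranges that lo..hi gets cut into."""
    stops = {hi}
    count = 1
    stop = fill_by_nines(lo, count)
    while lo <= stop < hi:
        stops.add(stop)
        count += 1
        stop = fill_by_nines(lo, count)
    count = 1
    stop = fill_by_zeros(hi + 1, count) - 1
    while lo < stop <= hi:
        stops.add(stop)
        count += 1
        stop = fill_by_zeros(hi + 1, count) - 1
    return sorted(stops)

def range_to_pattern(start, stop):
    """Pattern for one sub-range; start and stop have the same digit count."""
    pattern = ""
    any_digit = 0
    for s, e in zip(str(start), str(stop)):
        if s == e:
            pattern += s                 # digit is fixed
        elif s != "0" or e != "9":
            pattern += f"[{s}-{e}]"      # digit varies over part of 0-9
        else:
            any_digit += 1               # digit can be anything
    return pattern + "[0-9]" * any_digit

def range_regex(lo, hi):
    parts = []
    start = lo
    for stop in split_points(lo, hi):
        parts.append(range_to_pattern(start, stop))
        start = stop + 1
    return "|".join(parts)

print(range_regex(128, 255))
```

for 128 to 255 this should produce four alternatives equivalent to the ones in the regex above (it writes `[8-9]` where the hand-made version has `[89]`).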

calm, still no storm… weird…

still calm, still a few “false positives” which are easily dealt with, and forwardable almost immediately… ‼👍 but no “bitcoin sextortion” spam since 191202… and the record is currently held by 1LfYcbCsssB2niF3VWRBTVZFExzsweyPGQ, who i last heard from on 191127, who spammed me four hundred eighty-seven times

spam assassin has, apparently, figured out a regex (or something) for capturing bitcoin addresses, so after 191127, there have been no bitcoin sextortion spams that have NOT been labeled as ***SPAM*** by spam assassin, which makes them a lot easier to filter out.

but it’s weird, because, even though it has been almost a week now, waking up in the morning and NOT having two or three DOZEN spam messages to process makes me nervous that something else may be happening to all of those messages, and, potentially, legitimate messages, as well, and i have no clue what may be happening to them, because nobody other than me is even aware of the fact that they’re not there any longer. 😕

we started the panto. it’s Jack and The Beanstalk… i don’t remember whether this is the first panto we did, or the second panto we did, way back when we first started doing pantos, 17 years or so ago… but it’s largely the same script: different actors but the same characters… and no simon, but he hasn’t been involved since he got drunk, did something which he wasn’t supposed to (sexual harassment? stealing stuff? something that only drunk people do… 😒), and was banned from the palladium, a few years ago. we did the first four of 20 performances, last weekend, and only missed one music cue: half the band started a half a measure before the other half of the band, and none of the singers came in at the right point, but we recognised it almost immediately, and kiki said “wait, can we start that again?” and everybody came in on cue when we tried it again… and there was one place at the end of the panto, where the giant chops down the beanstalk, and the ogress (represented by a puppet) falls from the castle in the sky, along a zip-line, to the back of the palladium. but, this time, the doors to the castle opened, but no ogress came out… so we just continued, where the ogress (this time, the real actor) then “falls” back up to the front of the stage, and has a few lines… and then the puppet ogress decided it was time to fall… 🤣

but, all in all, the panto is going well.

calm… i hope no storm…

the past three full days now, i have gotten SIGNIFICANTLY less spam than normal… like, normally i’ll get anywhere from two to six DOZEN spam messages a day, and, since saturday, i have gotten, maybe two dozen total

i’ve been blocking ranges of IP addresses in argentina and peru and china and india and denmark and kazakhstan and iran and lithuania and brazil and germany and LOTS of ranges for russia, and luxembourg and vietnam and turkey and indonesia and romania and the UK and georgia (the country, not the state in the united states), and nigeria and egypt and cambodia and myanmar (and that’s only up to the 45.0.0.0/8 range) like a mad fiend, for about two months prior to saturday… and all of those places are places from which i have never received email that was not spam…

literally, i’ve been blocking JUST ranges connected with the 1LfYcbCsssB2niF3VWRBTVZFExzsweyPGQ “bitcoin porn sextortion” scam since october 4th. 🤬

maybe i’ve finally caught up with the script. i’ve got 1,043 filter rules, and a fair portion of them are IP ranges…

but it feels weird… nobody has complained that they’re not getting important emails, and the false positives that have been coming through are usually either dealt with by changing “contains” to “matches regex”, or by deleting rules that i don’t need any longer… like the one for the .mp TLD, which was giving me false positives all the time because of mailchi.mp, which, while spammy, is not universally spammy, and, as far as i can tell, is the only NON-spammy use of the .mp TLD… but i decided that, instead of figuring out how to rule out legitimate use of a spammy TLD, i just started banning the countries that the spam was coming from…

but it feels weird… i’ve been on edge for a couple of days now, and i’m pretty sure it’s directly related to my relationship with the computer and the ‘net… 😒

but not entirely related… i had a pair of blue sunglasses that i got before i went to oregon to busk, a few months ago, and i lost them about a week ago. since then i’ve been losing a whole bunch of other things — keys, tools, credit cards, that sort of thing — and i’ve been finding them again, usually in the same day, sometimes within the same 15 minutes or so… but i haven’t been able to find my sunglasses, and it PISSES ME OFF because the reason i got them, primarily, was to help alleviate some of my depression, and they have worked ADMIRABLY for that purpose… and i remember thinking, if i put them… wherever it was that i put them… 😕 and left them there for too long, i would probably not remember where they were, the next time i looked for them… 😒

it’s possible that they’re somewhere around the house, but i’ve looked at least three times in every place i can think of, and quite a few that i couldn’t have thought of in a long time, and have nothing to show for it except a much cleaner house. they’re not in the car, as far as i can tell, nor are they in my tuba case, or my tuba bag.

moe is going away for a few days — travelling for stuff related to her book — starting friday, which means that i won’t be able to go busking. and then panto starts (shudder) saturday: two shows, and two shows on sunday, which means that i won’t even be here to take care of the pets for significant portions of both days… fortunately, i’m picking her up at the airport after sunday’s shows are over.

and, on the unicycle side of things, i think i am actually learning to ride the unicycle… i have been consistently riding, in a “more-or-less” controlled fashion, in a marginally straight line, without falling over, half to three-quarters of the way across the gym, for two weeks now. and, i just got “certified” to come in and use the gym for practicing unicycle on days that we’re not having class, so i actually have a place to practice.

/8 blocks

i now have three /8 blocks in my email filters.

25.0.0.0/8 in the UK, 53.0.0.0/8 in germany, and 133.0.0.0/8 in japan.

the “standard” email filters, built on “and/or” and “contains/does not contain”, break down when you’re dealing with 16.78 MILLION addresses in a single block.

they break down because you can’t just filter on “25.”, which appears in the middle and end of IP addresses, in message ID numbers, and, occasionally, in the body of the message.

the result is A LOT of false positives: email which i can’t forward to the correct recipient, because it will get filtered AGAIN

which is quite annoying. 😒

so, with the help of my friend robert, i built a regular expression to handle it:

\D25\.\d{1,3}\.\d{1,3}\.\d{1,3}\D

finds a non-digit character followed by “25.”, followed by three repetitions of one to three digits, separated by periods, followed by another non-digit character.
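to see the difference between a plain “contains” rule and the regex, here is a small python sketch. every sample line is invented for illustration, and python's `re` module is standing in for whatever engine the mail filter really uses.

```python
import re

# the /8 pattern from the post, unchanged
SLASH8 = re.compile(r"\D25\.\d{1,3}\.\d{1,3}\.\d{1,3}\D")

def contains_rule(text):
    # the old-style filter: plain substring test
    return "25." in text

def regex_rule(text):
    return SLASH8.search(text) is not None

# made-up lines; only the first one holds an address in 25.0.0.0/8
samples = [
    "Received: from mail.example (25.14.3.200) by mx",
    "Received: from [125.25.7.9] by mx",
    "Message-ID: <abc.25.xyz@example.com>",
    "your total is $25.00 today",
]

results = [(contains_rule(s), regex_rule(s)) for s in samples]
for line, (plain, rx) in zip(samples, results):
    print(f"contains={plain!s:5} regex={rx!s:5} {line}")
```

the substring test flags all four lines, while the regex should only flag the first. note that `\d{1,3}` does not check that each number is 255 or less, so something like 25.999.999.999 would still get through as a match.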

technically, this regex could be adapted to accommodate any IP address, which means that, theoretically, i have a whole new, easier, and faster method of processing spam. 😈

the next step is to learn how to search for a specific range of digits… 😈

ETA 191127 i discovered that you can’t specify a range of digits with a regex. for that, you need a script, which is too much work. also, i determined that i DON’T need the white space character at the beginning and end of the regular expression, because, sometimes, the IP address is surrounded by parentheses, square brackets, or both.

ETA 191128 i changed it from white space character — \s — to non-digit character — \D — because some IP addresses are surrounded by parentheses or square brackets, but some are surrounded by white space characters. the only thing \D doesn’t capture is an empty string, so the IP address can’t be the first thing in the line of text.
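the start-of-line limitation (and the matching one at the end of a line) can be seen in a few lines of python. the second pattern is one possible workaround, using `(^|\D)` and `(\D|$)` so that the edge of the line also counts as a boundary; whether a particular mail filter's regex engine accepts that is an assumption, and the sample lines are invented.

```python
import re

# the pattern from the post: needs a real character on both sides
ORIGINAL = re.compile(r"\D25\.\d{1,3}\.\d{1,3}\.\d{1,3}\D")

# possible workaround: line start / line end are allowed as boundaries too
ANCHORED = re.compile(
    r"(^|\D)25\.\d{1,3}\.\d{1,3}\.\d{1,3}(\D|$)", re.MULTILINE
)

lines = [
    "25.1.2.3 connected",          # address is first thing on the line
    "connection from 25.1.2.3",    # address is last thing on the line
    "connection from (25.1.2.3)",  # address wrapped in parentheses
    "connection from 125.1.2.3",   # different /8, must not match
]

original_hits = [ORIGINAL.search(l) is not None for l in lines]
anchored_hits = [ANCHORED.search(l) is not None for l in lines]

for line, o, a in zip(lines, original_hits, anchored_hits):
    print(f"original={o!s:5} anchored={a!s:5} {line}")
```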

and, even with the \D, this regex, modified to capture 27.16.0.0/12 in china, captures 2.2019.11.27.23.41.02, which is part of the message ID on a LEGITIMATE message. 😖😒😠🤬
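here is a sketch of why that message ID gets caught. the exact pattern used for 27.16.0.0/12 isn't given above, so both versions below are guesses: one is the /8 pattern with 25 swapped for 27, the other limits the second number to 16 through 31, which is what a /12 starting at 27.16.0.0 covers. the domain after the @ and the `Received:` line are made up. the third pattern is one possible tightening, not a tested fix: it refuses a match when the neighbouring character is a dot.

```python
import re

# assumption: the /8 pattern with 25 swapped for 27
BROAD = re.compile(r"\D27\.\d{1,3}\.\d{1,3}\.\d{1,3}\D")
# assumption: second number limited to 16-31 (27.16.0.0/12)
RANGED = re.compile(r"\D27\.(1[6-9]|2[0-9]|3[01])\.\d{1,3}\.\d{1,3}\D")
# possible tightening: neighbours may be neither digits nor dots
STRICT = re.compile(
    r"[^\d.]27\.(1[6-9]|2[0-9]|3[01])\.\d{1,3}\.\d{1,3}[^\d.]"
)

# numeric part is from the post; the domain is hypothetical
message_id = "Message-ID: <2.2019.11.27.23.41.02@mail.example>"
# hypothetical header holding a real address in the range
real_header = "Received: from unknown ([27.23.41.2]) by mx"

def hits(text):
    """Which of the three patterns match this text?"""
    return tuple(p.search(text) is not None for p in (BROAD, RANGED, STRICT))

print("message id :", hits(message_id))
print("real header:", hits(real_header))
```

narrowing the range does not help here, because 23 really is between 16 and 31. the dot-aware version should skip the message ID and still catch the bracketed address, at the cost of missing an address that is immediately followed by a period, such as at the end of a sentence.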

this is why i’m rerouting these messages, rather than summarily deleting them, which is my inclination… summarily deleting what i think is spam has come back to bite me in the ass often enough that i don’t do it any longer. 😒

oy 😒

this morning i added a second /8 block to my email filters.

for those of you wondering what i’m talking about, a /8 block is the largest block of IP addresses allocated by the IANA.

16,777,216 individual IP addresses.
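the number comes from the prefix length: an IPv4 address has 32 bits, a /8 fixes the first 8 of them, and the remaining 24 are free, so there are 2^24 addresses. python's standard `ipaddress` module can confirm this; the block used here is the 133.0.0.0/8 one in japan, and the 133.7.8.9 address is an arbitrary example.

```python
import ipaddress

block = ipaddress.ip_network("133.0.0.0/8")

# 32 bits total, 8 fixed by the prefix, 24 left over
print(block.num_addresses, 2 ** (32 - 8))

# first and last address in the block
print(block[0], block[-1])

# membership test for a single address
print(ipaddress.ip_address("133.7.8.9") in block)
```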

my first filtered /8 block was in japan. my second one was in germany.

and i STILL get spam from japan and from germany. 😒

it doesn’t seem like it was that long ago that spam was something in a monty python skit, and before that, it was a canned meat byproduct.

it’s not even UCE any longer, because most of it is devoted to scams of one kind or another. actual, commercial email is a tiny fraction of the volumes of script-generated spam, these days.

spam times 16,777,216²… which is a number so large my scientific calculator chokes on it… which is to say, it says 2.81474976711e+14 rather than giving me a number i can understand. 😒
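the calculator isn't really choking, it's just rounding: 16,777,216 is 2^24, so its square is 2^48, and the e+14 display is that number squeezed into twelve significant digits. python integers don't overflow, so the full value can be printed directly.

```python
block_size = 2 ** 24            # 16,777,216 addresses in one /8
squared = block_size ** 2       # same as 2 ** 48

print(f"{squared:,}")           # exact value with thousands separators
print(f"{squared:.11e}")        # rounded the way the calculator shows it
```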