Nov. 04, 2016

2016 Presidential Election Ballot Order, by State

While digging through the AP elections API feed, I noticed it contained a BallotOrder attribute for each candidate, so I wanted to explore it a bit.  The table below shows the order in which the candidates' names are printed from state to state, focusing on the two big names.

Some states have a lot of candidates running for president (Colorado has 22), but every state, plus DC, listed Clinton and Trump in the top 8 slots.  Clinton's name appears before Trump's on 29 ballots, and Trump's appears before Clinton's on 22.

State | Name Order on Ballot (slots 1-8) | Total Number of Presidential Candidates
AK Clinton Trump 6
AL Clinton Trump 4
AR Clinton Trump 8
AZ Trump Clinton 4
CA Clinton Trump 5
CO Clinton Trump 22
CT Clinton Trump 4
DC Trump Clinton 4
DE Clinton Trump 4
FL Trump Clinton 6
GA Trump Clinton 3
HI Clinton Trump 5
IA Trump Clinton 10
ID Clinton Trump 8
IL Clinton Trump 4
IN Clinton Trump 3
KS Clinton Trump 4
KY Trump Clinton 6
LA Clinton Trump 13
MA Clinton Trump 4
MD Trump Clinton 4
ME Clinton Trump 4
MI Trump Clinton 6
MN Trump Clinton 9
MO Clinton Trump 5
MS Clinton Trump 7
MT Clinton Trump 5
NC Trump Clinton 3
ND Clinton Trump 6
NE Trump Clinton 4
NH Clinton Trump 5
NJ Clinton Trump 9
NM Trump Clinton 8
NV Clinton Trump 6
NY Clinton Trump 8
OH Clinton Trump 5
OK Trump Clinton 3
OR Trump Clinton 4
PA Clinton Trump 5
RI Trump Clinton 5
SC Clinton Trump 7
SD Trump Clinton 4
TN Trump Clinton 7
TX Trump Clinton 4
UT Trump Clinton 10
VA Clinton Trump 5
VT Clinton Trump 6
WA Clinton Trump 7
WI Trump Clinton 7
WV Trump Clinton 5
WY Trump Clinton 6
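
The tally quoted above can be reproduced with a few lines of Python. This is only a minimal sketch: the `(state, first, second)` tuple layout and the name `count_listed_first` are my own assumptions for illustration, not the shape of the AP feed, and only five rows from the table are included.

```python
from collections import Counter

# A handful of (state, listed first, listed second) rows copied from the table.
# The tuple layout is an assumption for this sketch, not the AP feed's format.
SAMPLE_ROWS = [
    ("AK", "Clinton", "Trump"),
    ("AZ", "Trump", "Clinton"),
    ("CO", "Clinton", "Trump"),
    ("FL", "Trump", "Clinton"),
    ("NY", "Clinton", "Trump"),
]


def count_listed_first(rows):
    """Count how many ballots list each candidate ahead of the other."""
    return Counter(first for _state, first, _second in rows)


tally = count_listed_first(SAMPLE_ROWS)
print(tally["Clinton"], tally["Trump"])  # 3 2 for this five-row sample
```

Feeding in all 51 rows instead of the sample should give the 29 and 22 split mentioned earlier.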


Oct. 24, 2016

Minecraft Square Moderator Application Age Distribution

Back in 2010 I ran a moderately popular Minecraft server called Minecraft Square.  Its niche was fast hardware with tons of RAM in the days when such servers weren't affordable.  As hosting hardware improved across the industry and my attention span diminished, I shut the server down.

However, I kept various server logs, and parsing them from time to time is somewhat interesting.  Back then we allowed any player to apply to become a moderator, and asked for some information so a panel could review each candidate. Below is a graph of the applicants' age distribution, collected from 2010 to 2011.


Sample size is approximately 345, and the highest bar, at age 14, represents 43 applications.

The moderator applicant ages were self-reported.  Much later, some players admitted they had inflated their age to improve their chances through perceived maturity, which confirmed what we occasionally suspected at the time. The bell curve in the distribution graph suggests to me that many kids probably did the same, since the vast majority of applicants were 11-16.

There's a steep drop-off at age 17, and my speculation is that 17-year-olds wanted to round themselves up to 18, a symbolic age for being seen as a trustworthy adult.

Feb. 10, 2015

Levenshtein distance between 10 million usernames and their passwords

Mark Burnett, a security researcher, recently released a collection of 10 million passwords along with their usernames. My question was, how different are 10 million usernames from their passwords?  Taking a tiny bit of time, I performed a simple analysis looking at the Levenshtein distance between them and composed the graph below.

In other words, if people in this dataset started from their username as a password (e.g. user dino, password dino) but then changed it a little (password dino1), how many insertions, deletions or substitutions did they make?  See for yourself.

A distance of 0 means the username and password are exactly identical (in the graph below, 213,133 passwords are the same as their usernames).  A distance of 1 means one character was added, deleted or changed. And so on...
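
For reference, here is a minimal sketch of how the distance between one username and one password can be computed. It is the textbook dynamic-programming approach, not necessarily the code used for the graph, and the function name `levenshtein` is just a label chosen for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or
    substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,          # delete ch_a
                current[j - 1] + 1,       # insert ch_b
                previous[j - 1] + cost,   # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


print(levenshtein("dino", "dino"))   # 0
print(levenshtein("dino", "dino1"))  # 1
```

Running this over every username and password pair and counting how often each distance occurs would give a histogram of the kind described here.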

Jan. 26, 2015

Interactive form builder with physics

Make sure to click the export button when done.

Oct. 07, 2014

Language spoken at home: LANP05, LANP12 differences

When using the Census Bureau's American Community Survey (ACS5) dataset, there are two fields on the `persons` set that stand out for me: LANP05 and LANP12.  They are "Language spoken at home," except they are split into two forms: "Language spoken at home for data years prior to 2012" and "Language spoken at home for data year 2012."

The data dictionary explains that the set contains two data vintages (before 2012, and 2012), but not why.


What's the actual difference?

LANP05 | LANP12 | Spoken in / Note
966. American Indian | 602. Krio | Sierra Leone
 | 675. Sindhi | Sindh region, Pakistan
 | 689. Uighur | AKA Uyghur tili, Uyghurche: Turkic language spoken in a Western province of China
 | 694. Mongolian |
 | 750. Micronesian | "The twenty Micronesian languages form a family of Oceanic languages..."
 | 761. Trukese | Apparently very rare: AKA Chuukese, Austronesian language family spoken primarily on the islands of Chuuk in the Caroline Islands in Micronesia with some speakers on Pohnpei and Guam.
 | 819. Ojibwa | AKA Chippewa, spoken in Canada and USA.

So in 2012, the Census Bureau changed the language classification but never normalized the fields. Instead, they kept the two fields separate and filled the gaps with -9. While that's annoying to someone who just wants these two fields to be one, it's important to draw attention to the changes and be aware of them.


Combining into one field

Supposing you imported the 15,318,124 rows of persons data into MySQL, here's how to combine the two fields into one:

mysql> alter table persons add LANPX int(11) after LANP05;

This should take about 5 minutes to complete on Amazon RDS. When it's done, copy every LANP05 value other than -9 into LANPX:

mysql> update persons set LANPX=LANP05 where LANP05!=-9;
Query OK, 14774037 rows affected (4 min 55.83 sec)
Rows matched: 14774037  Changed: 14774037  Warnings: 0

Then copy every LANP12 value other than -9 into LANPX:

mysql> update persons set LANPX=LANP12 where LANP12!=-9;
Query OK, 544087 rows affected (2 min 22.56 sec)
Rows matched: 13276440  Changed: 544087  Warnings: 0

As a sanity check, 14774037 + 544087 = 15318124: the rows changed by the two updates add up to the total number of persons rows.
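
The same merge rule can be sketched outside the database, for example when processing rows in a script. This is a minimal illustration that mirrors the two UPDATE statements in order; the function name `combine_lanp` and the constant `MISSING` are hypothetical names introduced here, and `None` stands in for the NULL that the new column holds before any update.

```python
from typing import Optional

MISSING = -9  # filler value reported in whichever field does not apply


def combine_lanp(lanp05: int, lanp12: int) -> Optional[int]:
    """Apply the two updates in order: LANP05 first, then LANP12."""
    lanpx = None  # the freshly added column starts out as NULL
    if lanp05 != MISSING:
        lanpx = lanp05  # first UPDATE
    if lanp12 != MISSING:
        lanpx = lanp12  # second UPDATE, overwrites if both are present
    return lanpx


print(combine_lanp(966, MISSING))  # 966
print(combine_lanp(MISSING, 602))  # 602
```

Because the LANP12 check runs second, it wins when both fields hold a real code, which matches what the SQL above does.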

Apr. 17, 2014

Resizing with maintained aspect ratio applet

Maybe I'm just way too old school, but when resizing anything, I like to jot down a simple fractional equation. For example, say there's a video that's 530 x 298 pixels and you need to resize it to 505 pixels wide while maintaining the aspect ratio.

First, set up the simple equation like so:

\frac {530}{298}=\frac {505}{y}

Second, cross-multiply the fraction:

530*y = 505 * 298

Then solve for y by dividing both sides by 530:

y = \frac{505 * 298}{530}

And you may want to round the answer:

y = 283.9433962264...

y \approx 284

Original Resized (Rounded)


In general terms, if you have:

\frac {a}{b}=\frac {c}{d}

Formulas for solving for c and d are:

c = \frac{(a* d)}{b}\ ; d = \frac{(b* c)}{a}
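
The two formulas translate directly into code. Below is a minimal sketch; the function names `scaled_height` and `scaled_width` are made up for this example, and rounding to the nearest pixel with Python's built-in `round` is an assumption about what you want.

```python
def scaled_height(width: int, height: int, new_width: int) -> int:
    """Solve a/b = c/d for d, with a=width, b=height, c=new_width."""
    return round(height * new_width / width)


def scaled_width(width: int, height: int, new_height: int) -> int:
    """Solve a/b = c/d for c, with a=width, b=height, d=new_height."""
    return round(width * new_height / height)


# The 530 x 298 video from the example, resized to 505 pixels wide.
print(scaled_height(530, 298, 505))  # 284
```

Because the result is rounded to whole pixels, the resized aspect ratio can differ very slightly from the original.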

Posts on this blog solely represent my personal opinions and technical experience.

© 2009-2017 Edin (Dino) Beslagic