Word2Vec Analysis of Harry Potter

Word2vec is a very cool algorithm that uses neural networks to map words to feature vectors. Those vectors then have interesting properties. You can use these vectors to cluster words into groups��� find words that are most similar in an unsupervised fashion.



I ran word2vec on the 7 harry potter books and ran some cosine similarities on words i thought might be interesting. (using the open source Gensim library I love) My favorite is the word ���password��� below��� which revealed a list of passwords to the Gryffendor Common Room.



The top word of each list is the focus word��� and then a list of the most similar words thereafter.



(64 dimensions : 1 epoch)

1.0000000000000002:<expecto

0.8790060473734644:<patronum

0.755879251292692:<pettigrews

0.6931649914220047:<scabbers

0.6921019768155199:<pettigrew

0.6854965539870754:<lumos

0.6800035990521911:<groped

0.664763725362591:<buckbeaks

0.6386204400460198:<lupins

0.6368874336372525:<forcing



(256 dimensions : 10 epochs)

1.0000000000000002:<expecto

0.881676712172453:<patronum

0.66633365764735:<expec

0.5100899639798923:<patrono

0.4183506799917183:<poppy

0.4151233329572086:<wailed

0.4093665595659816:<nicknamed

0.3973543132151556:<riddikulus

0.3944545196592748:<fistbeating

0.39103755743544927:<bristling



(256 dimensions : 10 epochs)

1.0:<lumos

0.42209239730743003:<imperio

0.40507159450163904:<ignited

0.3758083182732268:<tergeo

0.3744004687853999:<episkey

0.3658264199741773:<portus

0.35926607923955595:<uhoh

0.3574845249683229:<serpensortia

0.35644744188201566:<wandtip

0.3503601500539219:<dissendium



(64 dimensions : 1 epoch)

1.0:<weasley

0.7899136471577068:<cole

0.7580984973288645:<weasleys

0.704308962845765:<cattermole

0.696147590313737:<figg

0.6615516871288135:<bill

0.5775892972766857:<mr

0.5623844627363042:<mrs

0.547272989628803:<crouch

0.5346744691953205:<brightly



(256 dimensions : 10 epochs)

1.0:<weasley

0.5571492216402788:<mrs

0.525661506461713:<mr

0.5122609732187181:<redheads

0.47470149235833253:<greatauntie

0.4636894081306857:<basil

0.4615857402277965:<wwhats

0.44954942844918266:<whew

0.44593325992665894:<pranksters

0.4366713328133361:<weasleys



(64 dimensions : 1 epoch)

1.0000000000000002:<potter

0.7040619432963181:<severus

0.6994994807105462:<thank

0.6875348923351211:<sir

0.6822709821434811:<please

0.632136736822497:<may

0.6291806648311767:<cannot

0.6284403688596681:<yes

0.624241093676769:<kindly

0.6217287188964961:<will



(256 dimensions : 10 epochs)

0.9999999999999999:<potter

0.3901893042844876:<humble

0.38913611920424174:<rotter

0.36768347738598195:<heed

0.3666138243923663:<heroworshipped

0.3587120147197108:<woooooooo

0.34933445012876724:<characters

0.3440882110680894:<dunderbore

0.344022635479845:<questionall

0.34103571192794435:<pining



(64 dimensions : 1 epoch)

1.0000000000000002:<gryffindor

0.74139062089133:<team

0.734880573018089:<tower

0.7231970430099021:<slytherin

0.7021615040969098:<common

0.7018928824840882:<ravenclaw

0.6585783668158374:<points

0.6540640450146518:<spinnet

0.64507063817762:<bell

0.6289645923037345:<goal



(256 dimensions : 10 epochs)

1.0:<gryffindor

0.4581853977277023:<hufflepuff

0.4420478362268596:<tower

0.44149319299048684:<points

0.4410251892894441:<slytherin

0.4188516239321704:<ravenclaw

0.3600329597460896:<gryffindors

0.35418207017889025:<team

0.34891474092816965:<wellearned

0.3404098626732141:<penalty



(64 dimensions : 1 epoch)

0.9999999999999999:<ravenclaw

0.781214400477696:<hufflepuff

0.7512446691492253:<slytherin

0.7254427881734692:<captain

0.7208340720937167:<oclock

0.7147691706800555:<lee

0.7147331227062966:<montague

0.7018928824840882:<gryffindor

0.697691445889272:<team

0.6839508546137936:<wednesday



(256 dimensions : 10 epochs)

1.0:<ravenclaw

0.5732267167869871:<hufflepuff

0.43935832571981465:<slytherin

0.43555165040705907:<rowena

0.4188516239321704:<gryffindor

0.407678924818592:<turpin

0.38709404961590305:<helga

0.3708434204576607:<seeker

0.35874384841804063:<huffepuff

0.35602194786273583:<loses



(64 dimensions : 1 epoch)

1.0:<hufflepuff

0.7921196174194491:<fifth

0.7848088164254446:<seventh

0.781214400477696:<ravenclaw

0.7780135594276192:<fourth

0.7568581966579454:<hufflepuffs

0.7452088578170768:<slytherin

0.7329618627066322:<slytherins

0.7241609971259948:<thursday

0.7202963910695652:<gryffindors



(256 dimensions : 10 epochs)

1.0:<hufflepuff

0.5732267167869871:<ravenclaw

0.560598634502152:<slytherin

0.4581853977277023:<gryffindor

0.41404169865069024:<points

0.41335251749481483:<helga

0.40705600406104697:<pushover

0.3964490683438199:<applauding

0.39567843096475613:<loses

0.39244432848771427:<match



(64 dimensions : 1 epoch)

1.0000000000000002:<slytherin

0.7512446691492253:<ravenclaw

0.7452088578170768:<hufflepuff

0.7238327202395243:<team

0.7231970430099021:<gryffindor

0.7165509326427633:<captain

0.7030698136643037:<bell

0.6922161358648197:<goal

0.6914818449404438:<montague

0.6612854741883617:<pucey



(256 dimensions : 10 epochs)

1.0:<slytherin

0.560598634502152:<hufflepuff

0.44370546214405354:<slytherins

0.4410251892894441:<gryffindor

0.43935832571981465:<ravenclaw

0.4111100118874837:<team

0.3960522941982452:<captain

0.3834171440911241:<gains

0.3816636526374869:<pushover

0.37815919943099136:<fen



(64 dimensions : 1 epoch)

1.0:<hermione

0.6373033872867323:<she

0.5360344746735539:<neville

0.5274756001139249:<he

0.5214301504925287:<ginny

0.5109574888160943:<ron

0.4999163834053927:<luna

0.49574563180394704:<harry

0.451967913703171:<romilda

0.446476077232399:<her



(256 dimensions : 10 epochs)

1.0000000000000002:<hermione

0.5042554769286905:<ron

0.4808255935304602:<she

0.3717110232643911:<ginny

0.35913453543144513:<acidly

0.35306516236363134:<ermyknee

0.3515929674760651:<encouragingly

0.35124423149024275:<uhoh

0.3455436053959502:<grabbing

0.34365264010683094:<lavender



(64 dimensions : 1 epoch)

1.0:<dumbledore

0.7867909396010355:<snape

0.652590036205367:<slughorn

0.6378681113816113:<voldemort

0.6120050198022656:<aberforth

0.6019010767912868:<he

0.5929735071374835:<scrimgeour

0.5840436894354705:<sirius

0.5553253520856778:<dumbledores



(256 dimensions : 10 epochs)

1.0:<dumbledore

0.49688884807669753:<snape

0.43320641875023586:<fudge

0.42104734145946027:<voldemort

0.409769764165034:<corpses

0.40586327560858065:<lupin

0.3992514094799346:<dumbledores

0.3984795966379898:<headmaster

0.36481567503440565:<devastating

0.35980117000272555:<riddle



(64 dimensions : 1 epoch)

1.0000000000000002:<spell

0.5705834288105536:<jinx

0.5528068188658538:<clear

0.5308173140619804:<however

0.5256446406146716:<crumplehorned

0.5217704552103746:<hit

0.5214290685798957:<latter

0.5210853653286843:<sectumsempra

0.5204984581955842:<dolohov

0.5172019428367624:<made



(256 dimensions : 10 epochs)

0.9999999999999999:<spell

0.45622019920423756:<spells

0.38503414331623603:<charm

0.3744579769148505:<retaliate

0.3365386087137521:<stunning

0.31299212123460146:<fourpoint

0.3047318421865308:<defensive

0.3018663449722407:<jinx

0.2986278064898596:<practicing

0.29811271395978733:<hit



(64 dimensions : 1 epoch)

1.0:<chamber

0.7516236239509654:<staffroom

0.7097278407351896:<basilisk

0.6396829682468098:<secrets

0.6303229904669257:<messages

0.6292212718692173:<pages

0.6255458905896577:<gargoyle

0.623691053534629:<stage

0.6189105698380645:<sealed

0.6165084868604697:<hilt



(256 dimensions : 10 epochs)

1.0:<chamber

0.527350716695893:<secrets

0.45803538999265264:<forsaken

0.4509279299021009:<fanciful

0.3937455721042796:<unseal

0.3737416685904814:<roosters

0.35312885285278145:<drawingroom

0.3358263562408226:<unleash

0.32925459347667607:<daubed

0.32405200326296857:<runin



(128 dimensions : 1 epoch)

1.0:<black

0.7090495706356712:<thick

0.6825056469408932:<silver

0.6687549059638025:<white

0.6671827818068091:<thin

0.6595272052625423:<green

0.6559287359085553:<tiny

0.6520876963925315:<glass

0.6499197812776186:<bright

0.6376775975420493:<hair



(256 dimensions : 10 epochs)

1.0000000000000002:<black

0.4187862746289444:<ravaged

0.41376806599272925:<bald

0.3964342510912526:<silky

0.39399279699897416:<stripes

0.3782092012451359:<broadshouldered

0.3779053329184092:<graying

0.37579439331917774:<paleeyed

0.3734468240600482:<atop

0.37307859100552243:<leathery



(128 dimensions : 1 epoch)

1.0:<requirement

0.6805251004125765:<common

0.58822928078406:<detail

0.5869718509295418:<gaunts

0.5677933609626798:<portrait

0.5612000668830176:<rest

0.5611572795099765:<space

0.5554348127278043:<marquee

0.5519591805530948:<middle

0.5455370331332081:<slightest



(256 dimensions : 10 epochs)

0.9999999999999998:<requirement

0.4016798211784335:<unknowable

0.3867069097055046:<jampacked

0.37050061385790234:<room

0.3616482568609007:<partlife

0.3557306897710399:<assaulted

0.35489547684406436:<compression

0.33972003944909956:<common

0.32747964915808203:<upholstery

0.3205128295096265:<reliving



(128 dimensions : 1 epoch)

1.0000000000000002:<voldemort

0.6883758949036237:<voldemorts

0.6882314118148244:<bellatrix

0.6687046547843726:<sirius

0.6456113933094144:<dumbledores

0.6378681113816113:<dumbledore

0.6198750862701423:<lord

0.601800677476979:<prophecy

0.5979378004431092:<connection

0.5976976164728673:<ariana



(256 dimensions : 10 epochs)

1.0:<voldemort

0.5231034885732724:<lord

0.4949022273659753:<voldemorts

0.42104734145946027:<dumbledore

0.40269571073980126:<kill

0.39346439983965364:<inindeed

0.3625195714553203:<snape

0.36251702487473614:<elm

0.35635136288239055:<merciful

0.35450715479438577:<prostrate



(128 dimensions : 1 epoch)

1.0:<snape

0.7867909396010355:<dumbledore

0.7602249542226724:<slughorn

0.6055605238284325:<bellatrix

0.6030688726296658:<mcgonagall

0.5929395283735124:<phineas

0.5913798507548866:<umbridge

0.5626612723245389:<narcissa

0.5592886705523973:<horace

0.5535260402939475:<dumbledores



(256 dimensions : 10 epochs)

1.0:<snape

0.49688884807669753:<dumbledore

0.39141319194131824:<snapes

0.3788874389546836:<severus

0.3625195714553203:<voldemort

0.3602838216909781:<malfoy

0.3495411504557353:<mcgonagall

0.3432543679925834:<heroworshipped

0.3407031113901212:<silkily

0.3385766904895776:<bbbut



(128 dimensions : 1 epoch)

1.0000000000000002:<password

0.6759946677363984:<lady

0.6260981919566827:<north

0.6258023922269232:<tower

0.6253624288287926:<owlery

0.6189141007115594:<fat

0.6020323426260975:<afterward

0.5984440756682315:<waking

0.5788408987574969:<cave

0.5730507346420423:<heading



(256 dimensions : 10 epochs)

1.0:<password

0.47446983465719267:<fritters

0.4712847308362276:<lady

0.4654321201001382:<banana

0.4440026686633822:<flibbertigibbet

0.37558294153078775:<ladys

0.3705196683435887:<abstinence

0.36897045258759176:<fat

0.3470203296601344:<portrait

0.3172485840936194:<festive



(128 dimensions : 1 epoch)

1.0:<chocolate

0.7724365211403441:<frog

0.7337068328667901:<mug

0.7211389912598347:<squeak

0.7203372114689538:<mouthful

0.7123522162102919:<meat

0.7110375068682293:<male

0.7029104114344779:<swig

0.6955470835808224:<drops

0.6901040539654248:<blew



(256 dimensions : 10 epochs)

0.9999999999999999:<chocolate

0.5475055711590749:<frog

0.5431921714687046:<frogs

0.47751803650826924:<cards

0.45420825957039807:<cakes

0.43692834317867524:<marmalade

0.4356386019624927:<tarts

0.43098893267906485:<abbot

0.41999201661912083:<chopped

0.4170767831464866:<pasties



(128 dimensions : 1 epoch)

1.0:<phoenix

0.6477570284902336:<wizengamot

0.611456341909926:<order

0.601307062292621:<recent

0.5894212574676222:<warlock

0.5777717037132631:<educational

0.5672181094594285:<charges

0.5577200810058244:<mysteries

0.5560769924913356:<ministers

0.5528013856762239:<quibbler



(256 dimensions : 10 epochs)

0.9999999999999998:<phoenix

0.44566492530943996:<feather

0.4307836965454031:<order

0.41943464449038814:<maple

0.34291846139813786:<fawkes

0.33463728125471814:<cuttlebone

0.3323033150650069:<headquarters

0.33116604341560135:<brethren

0.31389382936668453:<tto

0.305949945529935:<cores

 •  0 comments  •  flag
Share on Twitter
Published on November 23, 2014 04:00
No comments have been added yet.


Andrew W. Trask's Blog

Andrew W. Trask
Andrew W. Trask isn't a Goodreads Author (yet), but they do have a blog, so here are some recent posts imported from their feed.
Follow Andrew W. Trask's blog with rss.