Simplified Machine-Learning Workflow #3
Author: Anton Antonov
Title: Simplified Machine-Learning Workflow #3
Description: Semantic Analysis (Part 4)
Category: Educational Materials
URL: http://www.notebookarchive.org/2020-09-55sv02t/
DOI: https://notebookarchive.org/2020-09-55sv02t
Date Added: 2020-09-11
Date Last Modified: 2020-09-11
File Size: 3.51 megabytes
Rights: Redistribution rights reserved



Latent Semantic Analysis (Part 4)
A Wolfram livecoding session
Anton Antonov
January 2020
Session overview
1. Review: last session’s matrix object.
   1.1. With named rows and columns.
2. Queries representation.
   2.1. Representing rstudio-conf-2019 abstracts in the vector space of WTC-2019 abstracts.
3. Making a search engine for:
   3.1. □ Raku’s documentation, or
   3.2. ■ WTC-2019 abstracts.
4. Dimension reduction over an image collection.
   4.1.
   4.2. Representation of unseen mandala images.
Data
Course data
In[]:=
ResourceFunction["ImportCSVToDataset"]["~/MathFiles/Presentations/Live-coding sessions Latent Semantic Analysis Worflows 2019-2020/Data-breakdown.csv"]
Out[]=
In[]:=
WebImage["https://github.com/antononcube/SimplifiedMachineLearningWorkflows-book/tree/master/Data"]
Out[]=
Read WTC-2019 abstracts
In[]:=
aWTCRecords=Import["https://raw.githubusercontent.com/antononcube/SimplifiedMachineLearningWorkflows-book/master/Data/Wolfram-Technology-Conference-2019-abstracts.json"];aWTCRecords=Association[Map[#〚1〛->Association[#〚2〛]&,aWTCRecords]];Length[aWTCRecords]
Out[]=
182
Read rstudio-conf-2019 abstracts
In[]:=
aRSCRecords=Import["https://raw.githubusercontent.com/antononcube/SimplifiedMachineLearningWorkflows-book/master/Data/RStudio-conf-2019-abstracts.json"];aRSCRecords=Association[Map[#〚1〛->Association[#〚2〛]&,aRSCRecords]];Length[aRSCRecords]
Out[]=
61
Image collection
In[]:=
SeedRandom[16];ResourceFunction["RandomMandala"][]
Out[]=
In[]:=
WebImage["https://resources.wolframcloud.com/FunctionRepository/resources/RandomMandala"]
Out[]=
Review
In[]:=
aWTCDesriptions=Map[#description&,aWTCRecords];
In[]:=
aWTCDesriptions//Length
Out[]=
182
In[]:=
lsaObjWTC=LSAMonUnit[aWTCDesriptions]⟹LSAMonMakeDocumentTermMatrix[{},Automatic]⟹LSAMonApplyTermWeightFunctions["IDF","TermFrequency","Cosine"]⟹LSAMonExtractTopics["NumberOfTopics"->24,"MinNumberOfDocumentsPerTerm"->2,Method->"NNMF",MaxSteps->20]⟹LSAMonEchoTopicsTable["NumberOfTableColumns"->8];
»
topics table:
Sparse matrix with named rows and columns
In[]:=
smat=lsaObjWTC⟹LSAMonTakeDocumentTermMatrix
Out[]=
SparseArray
In[]:=
MatrixForm[smat〚1;;12,1000;;1050〛]
Out[]//MatrixForm=
 | edit | edition | education | educational | educator | educators | educause | edx | eeg | effect | effective | effectively | effect” | efficacy | efficiency | efficient | efficiently | effort | efforts | ehr” | eigenvalue | eigenvalues | electric | electrical | electrodynamics | electroencephalogram | electromagnetic | electromagnetism | electronic | electronics | electrons | elegance | element | elementary | elements | elliptic | elusive | email | embed | embeddedsql” | embedding | embodied | emerge | emergence | emergency | emergent | emerging | emily | emissions | emitted | emitting |
Karl.Isensee.Oct.28.9:00.AM | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Jayanta.Phadikar.Oct.28.9:00.AM | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Tuseeta.Banerjee.Oct.28.1:00.PM | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Jan.Poeschko.and.Joel.Klein.Oct.28.1:00.PM | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Andrew.Steinacher.Sylvia.Haas.Keiko.Hirayama..Jason.Biggs.and.Luke.Titus.Oct.28.2:00.PM | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Oct.28.3:00.PM | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Oct.28.4:30.PM | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Stephen.Wolfram.Oct.28.6:00.PM | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Oct.29.7:30.AM | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Russell.Foltz-Smith.Oct.29.8:30.AM | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Gerli.Jõgeva.Oct.29.8:30.AM | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Claudio.Parazzoli.Oct.29.8:30.AM | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
In[]:=
MatrixPlot[smat]
Out[]=
In[]:=
RowSumsAssociation[smat]
In[]:=
TakeLargest[ColumnSumsAssociation[smat],12]
Out[]=
<|wolfram -> 183, language -> 105, data -> 100, talk -> 69, new -> 62, mathematica -> 60, functions -> 57, using -> 53, use -> 47, analysis -> 46, learning -> 40, none -> 38|>
In[]:=
TakeLargest[ColumnSumsAssociation[lsaObjWTC⟹LSAMonTakeWeightedDocumentTermMatrix],12]
Out[]=
<|none -> 15.1707, data -> 4.21456, wolfram -> 4.16885, new -> 3.70651, language -> 3.48292, functions -> 3.42412, mathematica -> 2.85901, talk -> 2.8295, enjoy -> 2.79883, networking -> 2.71834, join -> 2.65108, using -> 2.53217|>
Queries using matrix-vector multiplications
In[]:=
ind=ColumnNamesAssociation[smat]["anomalies"]
Out[]=
171
In[]:=
svec=SparseArray[ConstantArray[0,ColumnsCount[smat]]];svec〚ind〛=1;
In[]:=
TakeLargest[RowSumsAssociation[smat.Transpose[{svec}]],3]
Out[]=
<|Anton.Antonov.Oct.29.1:30.PM -> 3, Tuseeta.Banerjee.Oct.28.1:00.PM -> 0, Jayanta.Phadikar.Oct.28.9:00.AM -> 0|>
In[]:=
aWTCDesriptions[Keys[%]〚1〛]
Out[]=
In this presentation we show, explain and compare methods for finding anomalies, breaks and outliers in time series. We are interested in finding anomalies in both a single time series and a collection of time series. We (mostly) employ nonparametric methods. First, we look at some motivational examples from well-known datasets. Then we look into definitions of anomalies and definitions for measuring the success of time series anomaly detection. For a single time series, we apply both Wolfram Language built-in algorithms and additional specialized algorithms. We discuss in more detail algorithms based on <i>k</i> -nearest neighbors (KNN), dimension reduction, linear regression, quantile regression and prefix trees. For collections of time series, we discuss: transformations into uniform representations, simple outlier finding based on variable distributions, anomalous trend finding, anomaly finding with KNN and other related algorithms. We are going to discuss how anomaly finding helps in producing faithful simulations of multivariable datasets. Concrete, real-life time series are used in the examples.
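The query above multiplies the document-term matrix by an indicator vector that has a 1 in the column of the term of interest, so each entry of the product is that document's count for the term. A minimal Python sketch of the same idea follows (Python rather than the notebook's Wolfram Language; the toy documents, terms, counts, and helper names are hypothetical and not taken from the WTC-2019 data):

```python
# Hypothetical toy document-term matrix: rows are documents, columns are terms.
terms = ["anomalies", "data", "series"]
doc_term = {
    "doc-a": [3, 1, 2],
    "doc-b": [0, 4, 0],
    "doc-c": [1, 0, 5],
}

def query_by_term(matrix, term_names, term):
    """Multiply the matrix by a one-hot column vector for `term`."""
    ind = term_names.index(term)
    one_hot = [1 if j == ind else 0 for j in range(len(term_names))]
    # Matrix-vector product: one score per document (row).
    return {doc: sum(v * q for v, q in zip(row, one_hot))
            for doc, row in matrix.items()}

def take_largest(scores, n):
    """Return the n highest-scoring (name, score) pairs, best first."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:n]

top = take_largest(query_by_term(doc_term, terms, "anomalies"), 2)
print(top)
```

Replacing the one-hot vector with a vector that has several non-zero entries turns the same product into a multi-term query.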
In[]:=
MatrixPlot[SparseArray[lsaObjWTC⟹LSAMonNormalizeMatrixProduct[Normalized->Right]⟹LSAMonTakeW]]
Out[]=
In[]:=
lsAnomalInds=FindAnomalies[SparseArray[lsaObjWTC⟹LSAMonNormalizeMatrixProduct[Normalized->Right]⟹LSAMonTakeW],"AnomalyPositions"]
Out[]=
{4,32,33,45,73,93,105,109,119,134,140,174,180}
In[]:=
aWTCDesriptions〚lsAnomalInds〛
Queries representation
Let us see how rstudio-conf-2019 abstracts are represented in the vector space of WTC-2019 abstracts.
Focus texts (rstudio-conf-2019)
In[]:=
txt1=aRSCRecords〚12〛["Abstract"]
Out[]=
RStudio Package Manager is the newest professional product that helps teams, departments, and entire enterprises organize and centralize package management. If you’ve ever struggled with IT to get access to a new (any?) R package, reproduce an old result, or share your code with others, RStudio Package Manager can help! We’ll introduce the new product, discuss how R repositories can be used to solve problems and take a sneak peek at what is coming in 2019.
In[]:=
txt2=aRSCRecords〚56〛["Abstract"]
Out[]=
My talk will discuss how R, the tidyverse, and the community around R helped me to learn to code and create my first R package. My positive experiences with the resources for learning R and the community itself led me to create a blog detailing my experiences with R as a way to pass along the knowledge that I gained. The next step was to develop my first package. The debkeepr package integrates non-decimal monetary systems of pounds, shillings, and pence into R, making it possible to accurately analyze and visualize historical account books. It is my hope that debkeepr can help bring to light crucial and interesting social interactions that are buried in economic manuscripts, making these stories accessible to a wider audience.
Terms representation
In[]:=
matTerms1=lsaObjWTC⟹LSAMonRepresentByTerms[ToLowerCase@txt1]⟹LSAMonEchoFunctionValue[MatrixForm[#1〚All,Keys[Select[ColumnSumsAssociation[#1],#1>0&]]〛]&]⟹LSAMonTakeValue
»
 | 2019 | access | code | coming | departments | discuss | entire | help | helps | introduce | management | new | package | peek | problems | product | result | share | sneak | solve | used | we’ll |
1 | 0.168908 | 0.153097 | 0.120526 | 0.211964 | 0.211964 | 0.106177 | 0.168908 | 0.141288 | 0.192912 | 0.136337 | 0.168908 | 0.142391 | 0.58729 | 0.192912 | 0.120526 | 0.358787 | 0.192912 | 0.168908 | 0.179393 | 0.160341 | 0.0932808 | 0.146823 |
Out[]=
SparseArray
In[]:=
matTerms2=lsaObjWTC⟹LSAMonRepresentByTerms[ToLowerCase@txt2]⟹LSAMonEchoFunctionValue[MatrixForm[#1〚All,Keys[Select[ColumnSumsAssociation[#1],#1>0&]]〛]&]⟹LSAMonTakeValue
»
 | accessible | account | analyze | audience | blog | bring | code | community | create | crucial | detailing | develop | discuss | economic | experiences | gained | help | helped | interactions | interesting | knowledge | learn | learning | led | light | making | non | package | possible | resources | social | step | stories | systems | talk | way | wider |
1 | 0.153681 | 0.153681 | 0.116965 | 0.127734 | 0.168859 | 0.142912 | 0.0960161 | 0.307362 | 0.192032 | 0.134559 | 0.168859 | 0.127734 | 0.0845845 | 0.142912 | 0.307362 | 0.168859 | 0.112556 | 0.168859 | 0.153681 | 0.121963 | 0.112556 | 0.0987903 | 0.0700689 | 0.168859 | 0.153681 | 0.269118 | 0.142912 | 0.350894 | 0.116965 | 0.142912 | 0.121963 | 0.134559 | 0.168859 | 0.0910175 | 0.0434591 | 0.101787 | 0.168859 |
Out[]=
SparseArray
In[]:=
Norm@*SparseArray/@{matTerms1,matTerms2}
Out[]=
{1.,1.}
In[]:=
MatrixForm[matTerms1.Transpose[matTerms2]]
Out[]//MatrixForm=
 | 1 |
1 | 0.242533 |
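Both term vectors have unit norm, so the single entry of the product above is their cosine similarity. A minimal Python sketch of that computation over sparse vectors follows (Python rather than the notebook's Wolfram Language; the weights and the names `normalize` and `dot` are hypothetical, chosen only for illustration):

```python
import math

def normalize(vec):
    """Scale a sparse vector (term -> weight) to unit Euclidean norm."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm > 0 else dict(vec)

def dot(u, v):
    """Dot product of two sparse vectors; only shared terms contribute."""
    return sum(w * v[t] for t, w in u.items() if t in v)

# Hypothetical term weights for two short texts (not the notebook's values).
text_a = normalize({"package": 3.0, "code": 1.0, "help": 1.0})
text_b = normalize({"package": 2.0, "code": 1.0, "learning": 2.0})

# For unit vectors the dot product equals the cosine of the angle between them.
cosine = dot(text_a, text_b)
print(round(cosine, 4))
```

Only terms present in both texts contribute to the sum, so two abstracts that share few words score low in term space even when they are about similar subjects.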
Topics representation
In[]:=
matTopics1=lsaObjWTC⟹LSAMonRepresentByTopics[ToLowerCase@txt1]⟹LSAMonEchoFunctionValue[MatrixForm[#1〚All,Keys[Select[ColumnSumsAssociation[#1],#1>0&]]〛]&]⟹LSAMonTakeValue;matTopics1=WeightTermsOfSSparseMatrix[matTopics1,"None","None","Cosine"]
»
 | cloud-web-applications | graph-peek-familiarize | developments-functions-special | package-game-pap | blockchain-ledger-distributed | asymptotic-art-compute | cloud-mobile-introduction | finding-time-series | financial-data-sources | models-physical-model | wolfram|alpha-edition-notebook | clear=-<br-none |
1 | 0.00730076 | 0.136995 | 0.0100969 | 0.255452 | 0.00428355 | 0.00552975 | 0.0118622 | 0.0289472 | 0.00411564 | 0.00318786 | 0.0323035 | 0.0131236 |
Out[]=
SparseArray
In[]:=
TakeLargest[ColumnSumsAssociation[matTopics1],UpTo[6]]
Out[]=
<|package-game-pap -> 0.853837, graph-peek-familiarize -> 0.457898, wolfram|alpha-edition-notebook -> 0.107973, finding-time-series -> 0.0967548, clear=-<br-none -> 0.0438652, cloud-mobile-introduction -> 0.0396488|>
In[]:=
matTopics2=lsaObjWTC⟹LSAMonRepresentByTopics[ToLowerCase@txt2]⟹LSAMonEchoFunctionValue[MatrixForm[#1〚All,Keys[Select[ColumnSumsAssociation[#1],#1>0&]]〛]&]⟹LSAMonTakeValue;matTopics2=WeightTermsOfSSparseMatrix[matTopics2,"None","None","Cosine"]
»
 | neural-net-nets | developments-functions-special | package-game-pap | blockchain-ledger-distributed | asymptotic-art-compute | educators-classroom-sylva | cloud-mobile-introduction | finding-time-series | models-physical-model | librarylink-c++-degree | spatial-point-patterns | computational-thinking-humanities |
1 | 0.00501161 | 0.0116588 | 0.168331 | 0.00462197 | 0.00521703 | 0.0191225 | 0.0304247 | 0.00702409 | 0.0191461 | 0.0453552 | 0.0000467977 | 0.0524092 |
Out[]=
SparseArray
In[]:=
TakeLargest[ColumnSumsAssociation[matTopics2],UpTo[6]]
Out[]=
<|package-game-pap -> 0.887771, computational-thinking-humanities -> 0.276405, librarylink-c++-degree -> 0.239202, cloud-mobile-introduction -> 0.160459, models-physical-model -> 0.100976, educators-classroom-sylva -> 0.100851|>
In[]:=
Norm@*SparseArray/@{matTopics1,matTopics2}
Out[]=
{1.,1.}
In[]:=
MatrixForm[matTopics1.Transpose[matTopics2]]
Out[]//MatrixForm=
 | 1 |
1 | 0.747378 |
Search engine
Recommender object
In[]:=
matW=lsaObjWTC⟹LSAMonNormalizeMatrixProduct[Normalized->Right]⟹LSAMonTakeW;
In[]:=
smrObjWTC=SMRMonUnit[]⟹SMRMonCreate[<|"Terms"->lsaObjWTC⟹LSAMonTakeDocumentTermMatrix,"Topics"->matW|>]⟹SMRMonApplyNormalizationFunction["Cosine"]⟹SMRMonEchoFunctionContext[#matrices&];
»
<|Terms -> SparseArray, Topics -> SparseArray|>
Recommendations by terms
In[]:=
prof=Select[ColumnSumsAssociation[matTerms1],#>0&]
Out[]=
<|2019 -> 0.168908, access -> 0.153097, code -> 0.120526, coming -> 0.211964, departments -> 0.211964, discuss -> 0.106177, entire -> 0.168908, help -> 0.141288, helps -> 0.192912, introduce -> 0.136337, management -> 0.168908, new -> 0.142391, package -> 0.58729, peek -> 0.192912, problems -> 0.120526, product -> 0.358787, result -> 0.192912, share -> 0.168908, sneak -> 0.179393, solve -> 0.160341, used -> 0.0932808, we’ll -> 0.146823|>
In[]:=
smrObjWTC⟹SMRMonRecommendByProfile[prof,12]⟹SMRMonJoinAcross[aWTCDesriptions]⟹SMRMonEchoValue⟹SMRMonProveByMetadata[prof,{"Claudio.Parazzoli.Oct.29.8:30.AM"}]⟹SMRMonEchoValue;
»
value:
»
value: <|Claudio.Parazzoli.Oct.29.8:30.AM -> <|package -> 1., solve -> 0.273018, we’ll -> 0.25, help -> 0.240576, problems -> 0.205225, used -> 0.158833|>|>
In[]:=
txt1
Out[]=
RStudio Package Manager is the newest professional product that helps teams, departments, and entire enterprises organize and centralize package management. If you’ve ever struggled with IT to get access to a new (any?) R package, reproduce an old result, or share your code with others, RStudio Package Manager can help! We’ll introduce the new product, discuss how R repositories can be used to solve problems and take a sneak peek at what is coming in 2019.
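A recommendation by profile can be read as scoring every document by the dot product of its normalized row with the profile vector and keeping the highest scores. The Python sketch below illustrates that reading only (Python rather than the notebook's Wolfram Language); it is not the implementation behind `SMRMonRecommendByProfile`, and the rows, profile weights, and function names are hypothetical:

```python
import math

def cosine_normalize(row):
    """Scale a sparse row (term -> weight) to unit Euclidean norm."""
    norm = math.sqrt(sum(w * w for w in row.values()))
    return {t: w / norm for t, w in row.items()} if norm > 0 else dict(row)

def recommend_by_profile(matrix, profile, n):
    """Score each row by its dot product with the profile; return top n."""
    scores = {}
    for item, row in matrix.items():
        score = sum(w * profile.get(t, 0.0) for t, w in row.items())
        if score > 0:  # drop items that share no terms with the profile
            scores[item] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:n]

# Hypothetical sparse document-term rows, normalized row-wise.
docs = {
    "talk-1": cosine_normalize({"package": 2, "development": 1}),
    "talk-2": cosine_normalize({"neural": 3, "nets": 1}),
    "talk-3": cosine_normalize({"package": 1, "solve": 1, "problems": 1}),
}
# Hypothetical profile: term weights of the query text.
profile = {"package": 0.6, "product": 0.35, "solve": 0.15}

recs = recommend_by_profile(docs, profile, 2)
print(recs)
```

The terms that a recommended row shares with the profile are what explain its score, which is the kind of evidence the metadata proof above lists for one abstract.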
Recommendations by topics
In[]:=
prof
Out[]=
<|2019 -> 0.168908, access -> 0.153097, code -> 0.120526, coming -> 0.211964, departments -> 0.211964, discuss -> 0.106177, entire -> 0.168908, help -> 0.141288, helps -> 0.192912, introduce -> 0.136337, management -> 0.168908, new -> 0.142391, package -> 0.58729, peek -> 0.192912, problems -> 0.120526, product -> 0.358787, result -> 0.192912, share -> 0.168908, sneak -> 0.179393, solve -> 0.160341, used -> 0.0932808, we’ll -> 0.146823|>
In[]:=
prof2=Select[ColumnSumsAssociation[matTopics1],#>0&]
Out[]=
<|cloud-web-applications -> 0.0244025, graph-peek-familiarize -> 0.457898, developments-functions-special -> 0.0337485, package-game-pap -> 0.853837, blockchain-ledger-distributed -> 0.0143176, asymptotic-art-compute -> 0.0184829, cloud-mobile-introduction -> 0.0396488, finding-time-series -> 0.0967548, financial-data-sources -> 0.0137563, models-physical-model -> 0.0106553, wolfram|alpha-edition-notebook -> 0.107973, clear=-<br-none -> 0.0438652|>
In[]:=
smrObjWTC⟹SMRMonRecommendByProfile[prof2,12]⟹SMRMonJoinAcross[aWTCDesriptions]⟹SMRMonEchoValue⟹SMRMonProveByMetadata[prof2,"Eric.Jacopin.Oct.29.2:30.PM"]⟹SMRMonEchoValue⟹SMRMonProveByMetadata[prof2,"Jayanta.Phadikar.Oct.28.9:00.AM"]⟹SMRMonEchoValue;
»
value:
»
value: <|Eric.Jacopin.Oct.29.2:30.PM -> <|package-game-pap -> 1., graph-peek-familiarize -> 0.536283, finding-time-series -> 0.113318, blockchain-ledger-distributed -> 0.0167685|>|>
»
value: <|Jayanta.Phadikar.Oct.28.9:00.AM -> <|package-game-pap -> 1., cloud-web-applications -> 0.0285797, asymptotic-art-compute -> 0.0216469, models-physical-model -> 0.0124793|>|>
In[]:=
txt1
Out[]=
RStudio Package Manager is the newest professional product that helps teams, departments, and entire enterprises organize and centralize package management. If you’ve ever struggled with IT to get access to a new (any?) R package, reproduce an old result, or share your code with others, RStudio Package Manager can help! We’ll introduce the new product, discuss how R repositories can be used to solve problems and take a sneak peek at what is coming in 2019.
In[]:=
aWTCDesriptions["Jayanta.Phadikar.Oct.28.9:00.AM"]
Out[]=
Learn the skills, knowledge and tools to develop your own project with the Wolfram Language. Start with fundamental programming concepts, including iterations, nesting, pattern matching and function definitions. Then move on to more advanced techniques for package development and deployment. Along the way learn tips and tricks for good programming practices, debugging and writing efficient code in the Wolfram Language.
Random mandala images topics
Image collection
In[]:=
SeedRandom[12];ResourceFunction["RandomMandala"]["RotationalSymmetryOrder"->6,"ConnectingFunction"->FilledCurve@*BezierCurve]
Out[]=
In[]:=
SeedRandom[12];ResourceFunction["RandomMandala"]["RotationalSymmetryOrder"->3,"ConnectingFunction"->FilledCurve@*BezierCurve]
Out[]=
In[]:=
k=200;SeedRandom[14];imgs=MapThread[(ResourceFunction["RandomMandala"]["RotationalSymmetryOrder"->6,"ConnectingFunction"->FilledCurve@*BezierCurve])&,{RandomChoice[{18,17,4243,1,113,4818},k],RandomChoice[{6,3},k]}];
In[]:=
Magnify[Multicolumn[imgs,20],0.4]
Out[]=
LSA preparation
In[]:=
AbsoluteTiming[imgs2=ImageResize[#,{100,100}]&/@imgs;]
Out[]=
{13.74,Null}
LSA extract topics
In[]:=
RandomSample[imgs2,6]
Out[]=
In[]:=
imgVecs=Flatten[ImageData[Binarize@ColorNegate@ColorConvert[#,"Grayscale"]]]&/@imgs2;
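The line above turns each image into one row of the matrix: grayscale conversion, color negation (so the drawn strokes become the non-zero entries), binarization, and flattening. A minimal Python sketch of the same steps on a grayscale pixel grid follows (Python rather than the notebook's Wolfram Language; the fixed threshold of 0.5 and the name `image_to_vector` are assumptions of this sketch, and `Binarize` in Wolfram Language picks its threshold automatically by default):

```python
def image_to_vector(gray, threshold=0.5):
    """Negate, binarize, and flatten a grayscale image given as rows of
    floats in [0, 1]. The fixed threshold is an assumption of this sketch."""
    return [1 if (1.0 - p) > threshold else 0 for row in gray for p in row]

# Hypothetical 3x3 image: dark strokes (low values) on a white background.
img = [
    [1.0, 0.1, 1.0],
    [0.2, 0.0, 0.2],
    [1.0, 0.1, 1.0],
]
vec = image_to_vector(img)
print(vec)
```

With 100 by 100 images each vector has 10000 entries, which become the "terms" of the document-term matrix used for topic extraction.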
In[]:=
MatrixPlot[imgVecs,MaxPlotPoints->800]
Out[]=
In[]:=
Magnify[ColumnForm[ListLinePlot[#,PlotRange->All,AspectRatio->1/6,ImageSize->Large]&/@RandomSample[imgVecs,4]],0.7]
In[]:=
AbsoluteTiming[lsaObj=LSAMonUnit[]⟹LSAMonSetDocumentTermMatrix[SparseArray[imgVecs]]⟹LSAMonApplyTermWeightFunctions["None","None","Cosine"]⟹LSAMonExtractTopics[40,Method->"NNMF","MaxSteps"->12,"MinNumberOfDocumentsPerTerm"->0]⟹LSAMonNormalizeMatrixProduct[Normalized->Left];]
Out[]=
{59.7539,Null}
In[]:=
ListPlot[Norm/@SparseArray[lsaObj⟹LSAMonTakeH],Filling->Axis,PlotRange->All,PlotTheme->"Scientific"]
Out[]=
In[]:=
lsaObj⟹LSAMonNormalizeMatrixProduct[Normalized->Right]⟹LSAMonEchoFunctionContext[ImageAdjust[Image[Partition[#,ImageDimensions[imgs2〚1〛]〚1〛]]]&/@SparseArray[#H]&];
»
Basis extraction
In[]:=
H=SparseArray[lsaObj⟹LSAMonNormalizeMatrixProduct[Normalized->Right]⟹LSAMonTakeH]
Out[]=
SparseArray
In[]:=
lsBasisImgs=ImageAdjust[Image[Partition[#,ImageDimensions[imgs2〚1〛]〚1〛]]]&/@SparseArray[H]
Out[]=
In[]:=
Binarize/@lsBasisImgs
Out[]=
In[]:=
Colorize[#,ColorFunction->"Rainbow"]&/@RandomSample[lsBasisImgs,4]
Out[]=
In[]:=
ColorNegate/@Table[ImageAdjust[Blend[#&/@RandomSample[lsBasisImgs,4]]],12]
Out[]=
Approximation
Pick a known mandala image:
In[]:=
ind=RandomChoice[Range[Length[imgs2]]]
imgTest=imgs2〚ind〛
Out[]=
191
Out[]=
Or generate a new random mandala:
In[]:=
testMandala=ResourceFunction["RandomMandala"]["ConnectingFunction"->FilledCurve@*BezierCurve];imgTest=ImageResize[Image[testMandala],{100,100}]
Out[]=
In[]:=
imgTestVec=Flatten[ImageData[ColorNegate[ColorConvert[ImageResize[imgTest,ImageDimensions[imgs2〚1〛]],"Grayscale"]]]];Length[imgTestVec]
Out[]=
10000
In[]:=
matTest=ToSSparseMatrix[SparseArray[{imgTestVec}],"RowNames"->{"TestImage"},"ColumnNames"->Map[ToString,Range[Length[imgTestVec]]]];
In[]:=
matReprsentation=lsaObj⟹LSAMonRepresentByTopics[matTest]⟹LSAMonTakeValue;
In[]:=
lsCoeff=Normal@SparseArray[matReprsentation〚1,All〛];ListPlot[lsCoeff,Filling->Axis,PlotRange->All]
Out[]=
In[]:=
H=SparseArray[lsaObj⟹LSAMonNormalizeMatrixProduct[Normalized->Right]⟹LSAMonTakeH];
In[]:=
vecReprsentation=lsCoeff.H;
In[]:=
ImageAdjust@ColorNegate@Image[Clip[Partition[vecReprsentation,ImageDimensions[imgs2〚1〛]〚1〛],{0,1}]]
Out[]=
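The approximation above is a linear combination of the basis (topic) vectors: the coefficient vector is multiplied by the matrix H, the result is clipped to [0, 1], and the flat vector is partitioned back into image rows. A minimal Python sketch of those three steps follows (Python rather than the notebook's Wolfram Language; the tiny 2x2 basis, the coefficients, and the name `reconstruct` are hypothetical):

```python
def reconstruct(coeffs, basis, width):
    """Linear combination of basis vectors, clipped to [0, 1] and
    partitioned into rows of `width` pixels."""
    n = len(basis[0])
    # Vector-matrix product coeffs . basis, computed column by column.
    flat = [sum(c * b[j] for c, b in zip(coeffs, basis)) for j in range(n)]
    clipped = [min(1.0, max(0.0, x)) for x in flat]
    return [clipped[i:i + width] for i in range(0, n, width)]

# Hypothetical basis of two flattened 2x2 "topic" images.
H = [
    [1.0, 0.0, 0.0, 1.0],  # diagonal
    [0.0, 1.0, 1.0, 0.0],  # anti-diagonal
]
coeffs = [0.5, 1.5]
approx = reconstruct(coeffs, H, width=2)
print(approx)
```

An unseen image is reconstructed well only to the extent that its shapes can be built from the extracted basis images.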
Blending of basis images
In[]:=
AbsoluteTiming[directBlendingImages=Table[RemoveBackground@ImageAdjust[Blend[Colorize[#,ColorFunction->RandomChoice[{"IslandColors","FruitPunchColors","AvocadoColors","Rainbow"}]]&/@RandomChoice[lsBasisImgs,4],RandomReal[1,4]]],12];]
ImageCollage[directBlendingImages,Background->White,ImagePadding->3,ImageSize->600]
Out[]=
{0.678834,Null}
Out[]=


Cite this as: Anton Antonov, "Simplified Machine-Learning Workflow #3" from the Notebook Archive (2020), https://notebookarchive.org/2020-09-55sv02t


