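1 Prerequisites

Automating the steps behind the Twitter data analysis report requires some preparation:

  • Write a Makefile
  • Write R scripts that run from the shell with Rscript (including parameter passing)
  • Write a parameterized Makefile: pass the hashtag as a parameter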

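1.1 Writing the Rscript shell script

To use search_tweets() flexibly inside get_data(), the function needs parameters so that different hashtags can be fetched along with a tweet count. This requires a few changes to the script written earlier in compendium_301.

Change the get_data.R script in the R/ directory so that it accepts command-line arguments, and convert them to the right types: the arguments arrive as character strings, so the tweet count is converted with as.integer().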

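## get_data.R script in the R/ directory
# 0. Setup -----
library(rtweet)  # install.packages("rtweet")
library(tidyverse)

# Command-line arguments arrive as character strings: hashtag, then tweet count
args <- commandArgs(trailingOnly = TRUE)

topic <- args[1]
num_twits <- args[2]

# Search Korean-language tweets for the topic and save them to data/tw_dat.rds
get_data <- function(topic, num_twits) {
  
  tw_dat <- search_tweets(topic, n = num_twits, include_rts = TRUE, lang = "ko")

  tw_dat %>% 
    write_rds("data/tw_dat.rds")
}

# Convert the count to an integer before passing it on
get_data(topic, as.integer(num_twits))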
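
Because as.integer() turns a non-numeric string into NA, a mistyped tweet count would only surface once the script is already running. One option is to check the value in the shell before calling Rscript. The following is a minimal sketch under that assumption; the helper name is_positive_int is hypothetical and not part of the original scripts, and the sketch only prints the command it would run.

```shell
# is_positive_int VALUE : succeed only when VALUE is a whole number above zero
# (hypothetical helper, not part of the original scripts)
is_positive_int() {
  case $1 in
    '' | *[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
    *) [ "$1" -gt 0 ] ;;         # digits only: require a value above zero
  esac
}

num_twits=100
if is_positive_int "$num_twits"; then
  # Print the command that would be run with the validated count
  echo "Rscript R/get_data.R '#불평등' $num_twits"
else
  echo "tweet count must be a positive integer: $num_twits" >&2
fi
```

The same check could instead live inside get_data.R; doing it in the shell simply stops a bad value before R and its packages are loaded.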

Run the R/get_data.R script with the Rscript command, passing two arguments: the hashtag and the number of tweets.

Rscript R/get_data.R '#불평등' 100
Warning messages:
1: package ‘ggplot2’ was built under R version 3.5.2 
2: package ‘ggthemes’ was built under R version 3.5.2 
Warning message:
package ‘rtweet’ was built under R version 3.5.2 
── Attaching packages ─────────────────────────────────────────────────────── tidyverse 1.3.0 ──
✓ tibble  2.1.3     ✓ dplyr   0.8.3
✓ tidyr   1.0.0     ✓ stringr 1.4.0
✓ readr   1.3.1     ✓ forcats 0.4.0
✓ purrr   0.3.3     
── Conflicts ────────────────────────────────────────────────────────── tidyverse_conflicts() ──
x dplyr::filter()  masks stats::filter()
x purrr::flatten() masks rtweet::flatten()
x dplyr::lag()     masks stats::lag()
Warning messages:
1: package ‘tidyverse’ was built under R version 3.5.2 
2: package ‘tibble’ was built under R version 3.5.2 
3: package ‘tidyr’ was built under R version 3.5.2 
4: package ‘purrr’ was built under R version 3.5.2 
5: package ‘dplyr’ was built under R version 3.5.2 
6: package ‘stringr’ was built under R version 3.5.2 
7: package ‘forcats’ was built under R version 3.5.2 
ls -al data/
total 112
drwxr-xr-x   4 statkclee  staff    128 Jan 26 04:05 .
drwxr-xr-x  13 statkclee  staff    416 Jan 26 04:05 ..
-rw-r--r--   1 statkclee  staff   1158 Jan 26 04:05 top_users_activity.rds
-rw-r--r--   1 statkclee  staff  50577 Jan 26 04:09 tw_dat.rds

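1.2 Generating the graph

Just as the hashtag-related tweets were fetched and saved to a local file, the next step creates a graph of the tweet trend and saves it locally as a .png file. As before, the R script is modified slightly so that it takes its argument from Rscript.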

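# 0. Setup -----
library(rtweet)  # install.packages("rtweet")
library(tidyverse)
library(extrafont)
loadfonts()

# dir.create("processed")  # run once if the output directory does not exist

# The only command-line argument is the hashtag, used in the plot title
args <- commandArgs(trailingOnly = TRUE)
topic <- args[1]

graph_trend <- function(topic) {
  
  tw_dat <- read_rds("data/tw_dat.rds")
  
  # Hourly tweet counts drawn as a line chart with Korean labels
  trend_plot <- ts_data(tw_dat, by="hours") %>% 
    ggplot(aes(x=time, y=n)) +
    geom_line() +
    geom_point() +
    labs(x="", y="트윗횟수", title=paste0("시간별 ", topic, " 트윗 추세")) +
    theme_bw(base_family = "NanumGothic") +
    theme(legend.position = "top")
  
  # Pass the plot explicitly instead of relying on the last plot created
  ggsave("processed/twit_trend.png", plot = trend_plot)
}

graph_trend(topic)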

Running the script now creates the graph file processed/twit_trend.png locally.

Rscript R/graph_trend.R '#불평등'
Warning messages:
1: package ‘ggplot2’ was built under R version 3.5.2 
2: package ‘ggthemes’ was built under R version 3.5.2 
Warning message:
package ‘rtweet’ was built under R version 3.5.2 
── Attaching packages ─────────────────────────────────────────────────────── tidyverse 1.3.0 ──
✓ tibble  2.1.3     ✓ dplyr   0.8.3
✓ tidyr   1.0.0     ✓ stringr 1.4.0
✓ readr   1.3.1     ✓ forcats 0.4.0
✓ purrr   0.3.3     
── Conflicts ────────────────────────────────────────────────────────── tidyverse_conflicts() ──
x dplyr::filter()  masks stats::filter()
x purrr::flatten() masks rtweet::flatten()
x dplyr::lag()     masks stats::lag()
Warning messages:
1: package ‘tidyverse’ was built under R version 3.5.2 
2: package ‘tibble’ was built under R version 3.5.2 
3: package ‘tidyr’ was built under R version 3.5.2 
4: package ‘purrr’ was built under R version 3.5.2 
5: package ‘dplyr’ was built under R version 3.5.2 
6: package ‘stringr’ was built under R version 3.5.2 
7: package ‘forcats’ was built under R version 3.5.2 
Saving 7 x 7 in image

Now load the graph file saved locally.

[Figure: tweet trend]

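2 Building the Makefile, part 1

Following Dr. 권재명's guidance, refer to http://wiki.ktug.org/wiki/wiki.php/설치하기MacOSX/MacTeX and first install \(\LaTeX\) / XeLaTeX, then download and install the Nanum fonts (Nanum*) in particular. The installation takes about two hours, so set aside enough time in advance.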

sudo tlmgr repository add http://ftp.ktug.org/KTUG/texlive/tlnet ktug
sudo tlmgr pinning add ktug "*"
sudo tlmgr install nanumttf hcr-lvt
sudo tlmgr update --all --self

Now write the Makefile itself. Defining make clean and make all targets keeps the workflow tidy.

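# Makefile
# Note: each recipe line must begin with a tab character, not spaces
DATA = data
REPORT = report
TEMP = processed

# all and clean are commands, not files to be built
.PHONY: all clean

all: $(REPORT)/make_report.html

# Fetch tweets for the hashtag
$(DATA)/tw_dat.rds:
	Rscript R/get_data.R "#불평등" 100

# Draw the tweet trend graph from the saved data
$(TEMP)/twit_trend.png: $(DATA)/tw_dat.rds
	Rscript R/graph_trend.R "#불평등"

# Render the reports once the data and the graph exist
$(REPORT)/make_report.html: $(DATA)/tw_dat.rds $(TEMP)/twit_trend.png
	Rscript R/make_report.R

clean:
	rm -rf $(DATA)/tw_dat.rds
	rm -rf $(TEMP)/twit_trend.png
	rm -rf $(REPORT)/make_report.html $(REPORT)/make_report.pdf $(REPORT)/make_report.docx
	rm -rf $(REPORT)/*.tex $(REPORT)/*.log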

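To handle Korean text, make_report.R sets a Korean font and sets latex_engine to xelatex.

To generate the PDF report, pass latex_engine to pdf_document() together with mainfont=NanumGothic (supplied through pandoc_args). Install the Nanum fonts for \(\LaTeX\) beforehand so that Korean text in the PDF does not come out garbled.

## make_report.R file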
library(tidyverse)
library(rmarkdown)

rmarkdown::render("report/make_report.Rmd",
                  output_format="html_document",
                  output_file = "make_report.html",
                  clean = TRUE,
                  encoding = 'UTF-8',
                  output_dir = "report")

rmarkdown::render("report/make_report.Rmd",
                  output_format = pdf_document(toc=TRUE, latex_engine = 'xelatex',
                                 pandoc_args = c("--variable", "mainfont=NanumGothic")),
                  clean = TRUE,
                  output_file = "make_report.pdf",
                  encoding = 'UTF-8',
                  output_dir = "report")

rmarkdown::render("report/make_report.Rmd",
                  output_format = word_document(reference_docx = "../assets/word_template.docx"),
                  clean = TRUE,
                  output_file = "make_report.docx",
                  encoding = 'UTF-8',
                  output_dir = "report")

Running make all with the files above automatically generates make_report.pdf, make_report.docx, and make_report.html in the report/ directory.

make all
Rscript R/analyze_activity.R
Warning messages:
1: package ‘ggplot2’ was built under R version 3.5.2 
2: package ‘ggthemes’ was built under R version 3.5.2 
Warning message:
package ‘rtweet’ was built under R version 3.5.2 
── Attaching packages ─────────────────────────────────────────────────────── tidyverse 1.3.0 ──
✓ tibble  2.1.3     ✓ dplyr   0.8.3
✓ tidyr   1.0.0     ✓ stringr 1.4.0
✓ readr   1.3.1     ✓ forcats 0.4.0
✓ purrr   0.3.3     
── Conflicts ────────────────────────────────────────────────────────── tidyverse_conflicts() ──
x dplyr::filter()  masks stats::filter()
x purrr::flatten() masks rtweet::flatten()
x dplyr::lag()     masks stats::lag()
Warning messages:
1: package ‘tidyverse’ was built under R version 3.5.2 
2: package ‘tibble’ was built under R version 3.5.2 
3: package ‘tidyr’ was built under R version 3.5.2 
4: package ‘purrr’ was built under R version 3.5.2 
5: package ‘dplyr’ was built under R version 3.5.2 
6: package ‘stringr’ was built under R version 3.5.2 
7: package ‘forcats’ was built under R version 3.5.2 
Saving 7 x 7 in image
Rscript R/visualize_retweets.R
Warning messages:
1: package ‘ggplot2’ was built under R version 3.5.2 
2: package ‘ggthemes’ was built under R version 3.5.2 
Warning message:
package ‘rtweet’ was built under R version 3.5.2 
── Attaching packages ─────────────────────────────────────────────────────── tidyverse 1.3.0 ──
✓ tibble  2.1.3     ✓ dplyr   0.8.3
✓ tidyr   1.0.0     ✓ stringr 1.4.0
✓ readr   1.3.1     ✓ forcats 0.4.0
✓ purrr   0.3.3     
── Conflicts ────────────────────────────────────────────────────────── tidyverse_conflicts() ──
x dplyr::filter()  masks stats::filter()
x purrr::flatten() masks rtweet::flatten()
x dplyr::lag()     masks stats::lag()
Warning messages:
1: package ‘tidyverse’ was built under R version 3.5.2 
2: package ‘tibble’ was built under R version 3.5.2 
3: package ‘tidyr’ was built under R version 3.5.2 
4: package ‘purrr’ was built under R version 3.5.2 
5: package ‘dplyr’ was built under R version 3.5.2 
6: package ‘stringr’ was built under R version 3.5.2 
7: package ‘forcats’ was built under R version 3.5.2 

Attaching package: ‘igraph’

The following objects are masked from ‘package:dplyr’:

    as_data_frame, groups, union

The following objects are masked from ‘package:purrr’:

    compose, simplify

The following object is masked from ‘package:tidyr’:

    crossing

The following object is masked from ‘package:tibble’:

    as_data_frame

The following objects are masked from ‘package:stats’:

    decompose, spectrum

The following object is masked from ‘package:base’:

    union

Warning message:
package ‘igraph’ was built under R version 3.5.2 
null device 
          1 
Rscript R/visualize_nlp.R
Warning messages:
1: package ‘ggplot2’ was built under R version 3.5.2 
2: package ‘ggthemes’ was built under R version 3.5.2 
Warning message:
package ‘rtweet’ was built under R version 3.5.2 
── Attaching packages ─────────────────────────────────────────────────────── tidyverse 1.3.0 ──
✓ tibble  2.1.3     ✓ dplyr   0.8.3
✓ tidyr   1.0.0     ✓ stringr 1.4.0
✓ readr   1.3.1     ✓ forcats 0.4.0
✓ purrr   0.3.3     
── Conflicts ────────────────────────────────────────────────────────── tidyverse_conflicts() ──
x dplyr::filter()  masks stats::filter()
x purrr::flatten() masks rtweet::flatten()
x dplyr::lag()     masks stats::lag()
Warning messages:
1: package ‘tidyverse’ was built under R version 3.5.2 
2: package ‘tibble’ was built under R version 3.5.2 
3: package ‘tidyr’ was built under R version 3.5.2 
4: package ‘purrr’ was built under R version 3.5.2 
5: package ‘dplyr’ was built under R version 3.5.2 
6: package ‘stringr’ was built under R version 3.5.2 
7: package ‘forcats’ was built under R version 3.5.2 
Warning message:
package ‘tidytext’ was built under R version 3.5.2 
Saving 7 x 7 in image
Rscript R/make_report.R
Warning messages:
1: package ‘ggplot2’ was built under R version 3.5.2 
2: package ‘ggthemes’ was built under R version 3.5.2 
── Attaching packages ─────────────────────────────────────────────────────── tidyverse 1.3.0 ──
✓ tibble  2.1.3     ✓ dplyr   0.8.3
✓ tidyr   1.0.0     ✓ stringr 1.4.0
✓ readr   1.3.1     ✓ forcats 0.4.0
✓ purrr   0.3.3     
── Conflicts ────────────────────────────────────────────────────────── tidyverse_conflicts() ──
x dplyr::filter() masks stats::filter()
x dplyr::lag()    masks stats::lag()
Warning messages:
1: package ‘tidyverse’ was built under R version 3.5.2 
2: package ‘tibble’ was built under R version 3.5.2 
3: package ‘tidyr’ was built under R version 3.5.2 
4: package ‘purrr’ was built under R version 3.5.2 
5: package ‘dplyr’ was built under R version 3.5.2 
6: package ‘stringr’ was built under R version 3.5.2 
7: package ‘forcats’ was built under R version 3.5.2 
Warning message:
package ‘rmarkdown’ was built under R version 3.5.2 


processing file: make_report.Rmd

  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |..........                                                            |  14%
   inline R code fragments


  |                                                                            
  |....................                                                  |  29%
label: setup (with options) 
List of 1
 $ include: logi FALSE


  |                                                                            
  |..............................                                        |  43%
  ordinary text without R code


  |                                                                            
  |........................................                              |  57%
label: about-data

  |                                                                            
  |..................................................                    |  71%
  ordinary text without R code


  |                                                                            
  |............................................................          |  86%
label: visualize-geospatial

Attaching package: 'rtweet'

The following object is masked from 'package:purrr':

    flatten


  |                                                                            
  |......................................................................| 100%
  ordinary text without R code


output file: make_report.knit.md

/usr/local/bin/pandoc +RTS -K512m -RTS make_report.utf8.md --to html4 --from markdown+autolink_bare_uris+tex_math_single_backslash+smart --output /Users/statkclee/swc/compendium/report/make_report.html --email-obfuscation none --self-contained --standalone --section-divs --table-of-contents --toc-depth 3 --variable toc_float=1 --variable toc_selectors=h1,h2,h3 --variable toc_collapsed=1 --variable toc_smooth_scroll=1 --variable toc_print=1 --template /Library/Frameworks/R.framework/Versions/3.5/Resources/library/rmarkdown/rmd/h/default.html --highlight-style tango --number-sections --variable 'theme:bootstrap' --include-in-header /var/folders/g3/97168ry52ll6zfyl6ykyk3br0000gn/T//RtmpdQl4Fr/rmarkdown-str3b683cccac5.html --mathjax --variable 'mathjax-url:https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' --lua-filter /Library/Frameworks/R.framework/Versions/3.5/Resources/library/rmarkdown/rmd/lua/pagebreak.lua --lua-filter /Library/Frameworks/R.framework/Versions/3.5/Resources/library/rmarkdown/rmd/lua/latex-div.lua --variable code_folding=show --variable code_menu=1 

Output created: report/make_report.html
Warning messages:
1: package 'rtweet' was built under R version 3.5.2 
2: package 'leaflet' was built under R version 3.5.2 


processing file: make_report.Rmd

  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |..........                                                            |  14%
   inline R code fragments


  |                                                                            
  |....................                                                  |  29%
label: setup (with options) 
List of 1
 $ include: logi FALSE


  |                                                                            
  |..............................                                        |  43%
  ordinary text without R code


  |                                                                            
  |........................................                              |  57%
label: about-data

  |                                                                            
  |..................................................                    |  71%
  ordinary text without R code


  |                                                                            
  |............................................................          |  86%
label: visualize-geospatial

  |                                                                            
  |......................................................................| 100%
  ordinary text without R code


output file: make_report.knit.md

/usr/local/bin/pandoc +RTS -K512m -RTS make_report.utf8.md --to latex --from markdown+autolink_bare_uris+tex_math_single_backslash --output /Users/statkclee/swc/compendium/report/make_report.tex --self-contained --table-of-contents --toc-depth 2 --highlight-style tango --pdf-engine xelatex --variable graphics --lua-filter /Library/Frameworks/R.framework/Versions/3.5/Resources/library/rmarkdown/rmd/lua/pagebreak.lua --lua-filter /Library/Frameworks/R.framework/Versions/3.5/Resources/library/rmarkdown/rmd/lua/latex-div.lua --variable mainfont=NanumGothic --variable 'geometry:margin=1in' 

Output created: report/make_report.pdf
Warning message:
Package microtype Warning: Unknown slot number of character
(microtype)                `\'A'
(microtype)                in font encoding `TU' in inheritance list
(microtype)                `microtype.cfg/376(protrusion)'.
Package microtype Warning: Unknown slot number of character
(microtype)                `\~A'
(microtype)                in font encoding `TU' in inheritance list
(microtype)                `microtype.cfg/376(protrusion)'.
Package microtype Warning: Unknown slot number of character
(microtype)                `\"A'
(microtype)                in font encoding `TU' in inheritance list
(microtype)                `microtype.cfg/376(protrusion)'.
Package microtype Warning: Unknown slot number of character
(microtype)                `\r A'
(microtype)                in font encoding `TU' in inheritance list
(microtype)                `microtype.cfg/376(protrusion)'.
Package microtype Warning: Unknown slot number of character
(microtype)                `\k A'
(microtype)       [... truncated] 


processing file: make_report.Rmd

  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |..........                                                            |  14%
   inline R code fragments


  |                                                                            
  |....................                                                  |  29%
label: setup (with options) 
List of 1
 $ include: logi FALSE


  |                                                                            
  |..............................                                        |  43%
  ordinary text without R code


  |                                                                            
  |........................................                              |  57%
label: about-data

  |                                                                            
  |..................................................                    |  71%
  ordinary text without R code


  |                                                                            
  |............................................................          |  86%
label: visualize-geospatial

  |                                                                            
  |......................................................................| 100%
  ordinary text without R code


output file: make_report.knit.md

/usr/local/bin/pandoc +RTS -K512m -RTS make_report.utf8.md --to docx --from markdown+autolink_bare_uris+tex_math_single_backslash+smart --output /Users/statkclee/swc/compendium/report/make_report.docx --highlight-style tango --reference-doc ../assets/word_template.docx --lua-filter /Library/Frameworks/R.framework/Versions/3.5/Resources/library/rmarkdown/rmd/lua/pagebreak.lua 

Output created: report/make_report.docx
Warning message:
In hook_plot(f, options) :
  Chunk options fig.align is not supported for docx output

The results can be checked with the following ls command.

ls -al report/
total 11240
drwxr-xr-x  10 statkclee  staff      320 Jan 26 04:09 .
drwxr-xr-x  13 statkclee  staff      416 Jan 26 04:05 ..
-rw-r--r--@  1 statkclee  staff     6148 Jan 26 03:38 .DS_Store
-rw-r--r--   1 statkclee  staff     4261 Jan 26 03:36 make_report.Rmd
-rw-r--r--   1 statkclee  staff   646983 Jan 26 04:09 make_report.docx
-rw-r--r--   1 statkclee  staff  2321285 Jan 26 04:09 make_report.html
-rw-r--r--   1 statkclee  staff    76398 Jan 26 04:09 make_report.log
-rw-r--r--   1 statkclee  staff   599517 Jan 26 04:09 make_report.pdf
-rw-r--r--   1 statkclee  staff     1012 Jan 24 18:23 twitter_report.Rmd
-rw-r--r--@  1 statkclee  staff  1065230 Jan 24 18:23 twitter_report.html
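
Reading the ls listing by eye works for three files, but the same check can be scripted. The following is a minimal sketch, assuming a hypothetical helper named check_outputs (not part of the original project) that reports which expected files are present in a directory.

```shell
# check_outputs DIR FILE... : print ok/missing for each expected file in DIR
# and succeed only when every file is present (hypothetical helper)
check_outputs() {
  dir=$1
  shift
  missing=0
  for name in "$@"; do
    if [ -f "$dir/$name" ]; then
      echo "ok       $dir/$name"
    else
      echo "missing  $dir/$name"
      missing=$((missing + 1))
    fi
  done
  [ "$missing" -eq 0 ]
}

# Report on the three files that make all is expected to produce
check_outputs report make_report.html make_report.pdf make_report.docx \
  || echo "some report files are missing; re-run make all"
```

The function's exit status is zero only when nothing is missing, so it could also be used as a final step in a script that wraps make all.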

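3 Building the parameterized Makefile, part 2

A parameterized Makefile is needed because hard-coding the hashtag (#) in every rule is repetitive. Instead, the hashtag is passed as a parameter to the make all command.

First, define the parameters to pass as variables in the Makefile.

  • $(TOPIC): the hashtag
  • $(NUM_TWIT): the number of tweets

# parameterized Makefile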
# Note: each recipe line must begin with a tab character, not spaces
DATA = data
REPORT = report
TEMP = processed

# all and clean are commands, not files to be built
.PHONY: all clean

all: $(REPORT)/make_report.html

## Get Twitter Data (TOPIC and NUM_TWIT come from the make command line)
$(DATA)/tw_dat.rds:
	Rscript R/get_data.R $(TOPIC) $(NUM_TWIT)

## Visualize Twitter Data
$(TEMP)/twit_trend.png: $(DATA)/tw_dat.rds
	Rscript R/graph_trend.R $(TOPIC)

$(TEMP)/user_activity.png: $(DATA)/tw_dat.rds
	Rscript R/analyze_activity.R

$(TEMP)/retweet_network.png: $(DATA)/tw_dat.rds
	Rscript R/visualize_retweets.R

$(TEMP)/bow_nlp.png: $(DATA)/tw_dat.rds
	Rscript R/visualize_nlp.R

## Make Reports
$(REPORT)/make_report.html: $(DATA)/tw_dat.rds $(TEMP)/twit_trend.png $(TEMP)/user_activity.png $(TEMP)/retweet_network.png $(TEMP)/bow_nlp.png
	Rscript R/make_report.R

clean:
	rm -rf $(DATA)/tw_dat.rds
	rm -rf $(TEMP)/twit_trend.png $(TEMP)/user_activity.png $(TEMP)/retweet_network.png $(TEMP)/bow_nlp.png
	rm -rf $(REPORT)/make_report.html $(REPORT)/make_report.pdf $(REPORT)/make_report.docx
	rm -rf $(REPORT)/*.tex $(REPORT)/*.log

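With the parameterized Makefile defined as above, pass the values to make in the form VARIABLE=value. Note that # must be escaped: write the value with a leading backslash inside single quotes, as in '\#AI'. The recipe expands $(TOPIC) unquoted, so the shell that runs the recipe would otherwise treat everything from # onward as a comment.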
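
The backslash matters because make pastes the value of $(TOPIC) into the recipe line as plain text and then hands that line to the shell, where an unescaped # at the start of a word begins a comment. The following is a minimal sketch of that behaviour. The run_recipe helper is hypothetical and only stands in for make (GNU make runs recipe lines with /bin/sh -c by default), and printf stands in for the Rscript call.

```shell
# run_recipe TOPIC : mimic make by pasting the variable's text, unquoted,
# into a command line and handing that line to sh -c (hypothetical helper)
run_recipe() {
  sh -c "printf '[%s]\n' $1"
}

run_recipe '\#AI'   # the shell sees \#AI and passes #AI; prints [#AI]
run_recipe '#AI'    # the shell sees #AI as a comment; prints []
```

The second call shows what the recipe would receive without the escape: no argument at all, which is why the command below passes TOPIC='\#AI'.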

make clean
make all TOPIC='\#AI'  NUM_TWIT=1000
rm -rf data/tw_dat.rds 
rm -rf processed/twit_trend.png processed/user_activity.png processed/retweet_network.png processed/bow_nlp.png
rm -rf report/make_report.html report/make_report.pdf report/make_report.docx
rm -rf report/*.tex report/*.log 
Rscript R/get_data.R \#AI 1000
Warning messages:
1: package ‘ggplot2’ was built under R version 3.5.2 
2: package ‘ggthemes’ was built under R version 3.5.2 
Warning message:
package ‘rtweet’ was built under R version 3.5.2 
── Attaching packages ─────────────────────────────────────────────────────── tidyverse 1.3.0 ──
✓ tibble  2.1.3     ✓ dplyr   0.8.3
✓ tidyr   1.0.0     ✓ stringr 1.4.0
✓ readr   1.3.1     ✓ forcats 0.4.0
✓ purrr   0.3.3     
── Conflicts ────────────────────────────────────────────────────────── tidyverse_conflicts() ──
x dplyr::filter()  masks stats::filter()
x purrr::flatten() masks rtweet::flatten()
x dplyr::lag()     masks stats::lag()
Warning messages:
1: package ‘tidyverse’ was built under R version 3.5.2 
2: package ‘tibble’ was built under R version 3.5.2 
3: package ‘tidyr’ was built under R version 3.5.2 
4: package ‘purrr’ was built under R version 3.5.2 
5: package ‘dplyr’ was built under R version 3.5.2 
6: package ‘stringr’ was built under R version 3.5.2 
7: package ‘forcats’ was built under R version 3.5.2 
Rscript R/graph_trend.R \#AI
Warning messages:
1: package ‘ggplot2’ was built under R version 3.5.2 
2: package ‘ggthemes’ was built under R version 3.5.2 
Warning message:
package ‘rtweet’ was built under R version 3.5.2 
── Attaching packages ─────────────────────────────────────────────────────── tidyverse 1.3.0 ──
✓ tibble  2.1.3     ✓ dplyr   0.8.3
✓ tidyr   1.0.0     ✓ stringr 1.4.0
✓ readr   1.3.1     ✓ forcats 0.4.0
✓ purrr   0.3.3     
── Conflicts ────────────────────────────────────────────────────────── tidyverse_conflicts() ──
x dplyr::filter()  masks stats::filter()
x purrr::flatten() masks rtweet::flatten()
x dplyr::lag()     masks stats::lag()
Warning messages:
1: package ‘tidyverse’ was built under R version 3.5.2 
2: package ‘tibble’ was built under R version 3.5.2 
3: package ‘tidyr’ was built under R version 3.5.2 
4: package ‘purrr’ was built under R version 3.5.2 
5: package ‘dplyr’ was built under R version 3.5.2 
6: package ‘stringr’ was built under R version 3.5.2 
7: package ‘forcats’ was built under R version 3.5.2 
Saving 7 x 7 in image
Rscript R/analyze_activity.R
Warning messages:
1: package ‘ggplot2’ was built under R version 3.5.2 
2: package ‘ggthemes’ was built under R version 3.5.2 
Warning message:
package ‘rtweet’ was built under R version 3.5.2 
── Attaching packages ─────────────────────────────────────────────────────── tidyverse 1.3.0 ──
✓ tibble  2.1.3     ✓ dplyr   0.8.3
✓ tidyr   1.0.0     ✓ stringr 1.4.0
✓ readr   1.3.1     ✓ forcats 0.4.0
✓ purrr   0.3.3     
── Conflicts ────────────────────────────────────────────────────────── tidyverse_conflicts() ──
x dplyr::filter()  masks stats::filter()
x purrr::flatten() masks rtweet::flatten()
x dplyr::lag()     masks stats::lag()
Warning messages:
1: package ‘tidyverse’ was built under R version 3.5.2 
2: package ‘tibble’ was built under R version 3.5.2 
3: package ‘tidyr’ was built under R version 3.5.2 
4: package ‘purrr’ was built under R version 3.5.2 
5: package ‘dplyr’ was built under R version 3.5.2 
6: package ‘stringr’ was built under R version 3.5.2 
7: package ‘forcats’ was built under R version 3.5.2 
Saving 7 x 7 in image
Rscript R/visualize_retweets.R
Warning messages:
1: package ‘ggplot2’ was built under R version 3.5.2 
2: package ‘ggthemes’ was built under R version 3.5.2 
Warning message:
package ‘rtweet’ was built under R version 3.5.2 
── Attaching packages ─────────────────────────────────────────────────────── tidyverse 1.3.0 ──
✓ tibble  2.1.3     ✓ dplyr   0.8.3
✓ tidyr   1.0.0     ✓ stringr 1.4.0
✓ readr   1.3.1     ✓ forcats 0.4.0
✓ purrr   0.3.3     
── Conflicts ────────────────────────────────────────────────────────── tidyverse_conflicts() ──
x dplyr::filter()  masks stats::filter()
x purrr::flatten() masks rtweet::flatten()
x dplyr::lag()     masks stats::lag()
Warning messages:
1: package ‘tidyverse’ was built under R version 3.5.2 
2: package ‘tibble’ was built under R version 3.5.2 
3: package ‘tidyr’ was built under R version 3.5.2 
4: package ‘purrr’ was built under R version 3.5.2 
5: package ‘dplyr’ was built under R version 3.5.2 
6: package ‘stringr’ was built under R version 3.5.2 
7: package ‘forcats’ was built under R version 3.5.2 

Attaching package: ‘igraph’

The following objects are masked from ‘package:dplyr’:

    as_data_frame, groups, union

The following objects are masked from ‘package:purrr’:

    compose, simplify

The following object is masked from ‘package:tidyr’:

    crossing

The following object is masked from ‘package:tibble’:

    as_data_frame

The following objects are masked from ‘package:stats’:

    decompose, spectrum

The following object is masked from ‘package:base’:

    union

Warning message:
package ‘igraph’ was built under R version 3.5.2 
null device 
          1 
Rscript R/visualize_nlp.R
Warning messages:
1: package ‘ggplot2’ was built under R version 3.5.2 
2: package ‘ggthemes’ was built under R version 3.5.2 
Warning message:
package ‘rtweet’ was built under R version 3.5.2 
── Attaching packages ─────────────────────────────────────────────────────── tidyverse 1.3.0 ──
✓ tibble  2.1.3     ✓ dplyr   0.8.3
✓ tidyr   1.0.0     ✓ stringr 1.4.0
✓ readr   1.3.1     ✓ forcats 0.4.0
✓ purrr   0.3.3     
── Conflicts ────────────────────────────────────────────────────────── tidyverse_conflicts() ──
x dplyr::filter()  masks stats::filter()
x purrr::flatten() masks rtweet::flatten()
x dplyr::lag()     masks stats::lag()
Warning messages:
1: package ‘tidyverse’ was built under R version 3.5.2 
2: package ‘tibble’ was built under R version 3.5.2 
3: package ‘tidyr’ was built under R version 3.5.2 
4: package ‘purrr’ was built under R version 3.5.2 
5: package ‘dplyr’ was built under R version 3.5.2 
6: package ‘stringr’ was built under R version 3.5.2 
7: package ‘forcats’ was built under R version 3.5.2 
Warning message:
package ‘tidytext’ was built under R version 3.5.2 
Saving 7 x 7 in image
Rscript R/make_report.R
Warning messages:
1: package ‘ggplot2’ was built under R version 3.5.2 
2: package ‘ggthemes’ was built under R version 3.5.2 
── Attaching packages ─────────────────────────────────────────────────────── tidyverse 1.3.0 ──
✓ tibble  2.1.3     ✓ dplyr   0.8.3
✓ tidyr   1.0.0     ✓ stringr 1.4.0
✓ readr   1.3.1     ✓ forcats 0.4.0
✓ purrr   0.3.3     
── Conflicts ────────────────────────────────────────────────────────── tidyverse_conflicts() ──
x dplyr::filter() masks stats::filter()
x dplyr::lag()    masks stats::lag()
Warning messages:
1: package ‘tidyverse’ was built under R version 3.5.2 
2: package ‘tibble’ was built under R version 3.5.2 
3: package ‘tidyr’ was built under R version 3.5.2 
4: package ‘purrr’ was built under R version 3.5.2 
5: package ‘dplyr’ was built under R version 3.5.2 
6: package ‘stringr’ was built under R version 3.5.2 
7: package ‘forcats’ was built under R version 3.5.2 
Warning message:
package ‘rmarkdown’ was built under R version 3.5.2 


processing file: make_report.Rmd

  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |..........                                                            |  14%
   inline R code fragments


  |                                                                            
  |....................                                                  |  29%
label: setup (with options) 
List of 1
 $ include: logi FALSE


  |                                                                            
  |..............................                                        |  43%
  ordinary text without R code


  |                                                                            
  |........................................                              |  57%
label: about-data

  |                                                                            
  |..................................................                    |  71%
  ordinary text without R code


  |                                                                            
  |............................................................          |  86%
label: visualize-geospatial

Attaching package: 'rtweet'

The following object is masked from 'package:purrr':

    flatten


  |                                                                            
  |......................................................................| 100%
  ordinary text without R code


output file: make_report.knit.md

/usr/local/bin/pandoc +RTS -K512m -RTS make_report.utf8.md --to html4 --from markdown+autolink_bare_uris+tex_math_single_backslash+smart --output /Users/statkclee/swc/compendium/report/make_report.html --email-obfuscation none --self-contained --standalone --section-divs --table-of-contents --toc-depth 3 --variable toc_float=1 --variable toc_selectors=h1,h2,h3 --variable toc_collapsed=1 --variable toc_smooth_scroll=1 --variable toc_print=1 --template /Library/Frameworks/R.framework/Versions/3.5/Resources/library/rmarkdown/rmd/h/default.html --highlight-style tango --number-sections --variable 'theme:bootstrap' --include-in-header /var/folders/g3/97168ry52ll6zfyl6ykyk3br0000gn/T//Rtmpgl7R1N/rmarkdown-str3c1310508625.html --mathjax --variable 'mathjax-url:https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' --lua-filter /Library/Frameworks/R.framework/Versions/3.5/Resources/library/rmarkdown/rmd/lua/pagebreak.lua --lua-filter /Library/Frameworks/R.framework/Versions/3.5/Resources/library/rmarkdown/rmd/lua/latex-div.lua --variable code_folding=show --variable code_menu=1 

Output created: report/make_report.html
Warning messages:
1: package 'rtweet' was built under R version 3.5.2 
2: package 'leaflet' was built under R version 3.5.2 


processing file: make_report.Rmd

  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |..........                                                            |  14%
   inline R code fragments


  |                                                                            
  |....................                                                  |  29%
label: setup (with options) 
List of 1
 $ include: logi FALSE


  |                                                                            
  |..............................                                        |  43%
  ordinary text without R code


  |                                                                            
  |........................................                              |  57%
label: about-data

  |                                                                            
  |..................................................                    |  71%
  ordinary text without R code


  |                                                                            
  |............................................................          |  86%
label: visualize-geospatial

  |                                                                            
  |......................................................................| 100%
  ordinary text without R code


output file: make_report.knit.md

/usr/local/bin/pandoc +RTS -K512m -RTS make_report.utf8.md --to latex --from markdown+autolink_bare_uris+tex_math_single_backslash --output /Users/statkclee/swc/compendium/report/make_report.tex --self-contained --table-of-contents --toc-depth 2 --highlight-style tango --pdf-engine xelatex --variable graphics --lua-filter /Library/Frameworks/R.framework/Versions/3.5/Resources/library/rmarkdown/rmd/lua/pagebreak.lua --lua-filter /Library/Frameworks/R.framework/Versions/3.5/Resources/library/rmarkdown/rmd/lua/latex-div.lua --variable mainfont=NanumGothic --variable 'geometry:margin=1in' 

Output created: report/make_report.pdf
Warning message:
Package microtype Warning: Unknown slot number of character
(microtype)                `\'A'
(microtype)                in font encoding `TU' in inheritance list
(microtype)                `microtype.cfg/376(protrusion)'.
Package microtype Warning: Unknown slot number of character
(microtype)                `\~A'
(microtype)                in font encoding `TU' in inheritance list
(microtype)                `microtype.cfg/376(protrusion)'.
Package microtype Warning: Unknown slot number of character
(microtype)                `\"A'
(microtype)                in font encoding `TU' in inheritance list
(microtype)                `microtype.cfg/376(protrusion)'.
Package microtype Warning: Unknown slot number of character
(microtype)                `\r A'
(microtype)                in font encoding `TU' in inheritance list
(microtype)                `microtype.cfg/376(protrusion)'.
Package microtype Warning: Unknown slot number of character
(microtype)                `\k A'
(microtype)       [... truncated] 


processing file: make_report.Rmd

  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |..........                                                            |  14%
   inline R code fragments


  |                                                                            
  |....................                                                  |  29%
label: setup (with options) 
List of 1
 $ include: logi FALSE


  |                                                                            
  |..............................                                        |  43%
  ordinary text without R code


  |                                                                            
  |........................................                              |  57%
label: about-data

  |                                                                            
  |..................................................                    |  71%
  ordinary text without R code


  |                                                                            
  |............................................................          |  86%
label: visualize-geospatial

  |                                                                            
  |......................................................................| 100%
  ordinary text without R code


output file: make_report.knit.md

/usr/local/bin/pandoc +RTS -K512m -RTS make_report.utf8.md --to docx --from markdown+autolink_bare_uris+tex_math_single_backslash+smart --output /Users/statkclee/swc/compendium/report/make_report.docx --highlight-style tango --reference-doc ../assets/word_template.docx --lua-filter /Library/Frameworks/R.framework/Versions/3.5/Resources/library/rmarkdown/rmd/lua/pagebreak.lua 

Output created: report/make_report.docx
Warning message:
In hook_plot(f, options) :
  Chunk options fig.align is not supported for docx output

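4 Adding Twitter Analyses

Once the Twitter data has been fetched, the analysis can be extended with text, network, visualization, and spatial-statistics components by adding more functions. Put the analysis scripts in the R/ directory, and save each function's results, such as graph images, fitted models, and intermediate data files, in the processed/ directory. When further analyses are added later, follow the same pattern: add the function to R/ and save its intermediate outputs to processed/. Finally, complete the .Rmd report in literate-programming style. With the analysis outputs organized systematically, the decisions and the supporting evidence from the data analysis can then be written up carefully.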

tree
.
├── Makefile
├── R
│   ├── analyze_activity.R
│   ├── get_data.R
│   ├── graph_trend.R
│   ├── make_report.R
│   ├── visualize_nlp.R
│   └── visualize_retweets.R
├── README.md
├── assets
│   └── word_template.docx
├── compendium.Rmd
├── compendium.html
├── data
│   ├── top_users_activity.rds
│   └── tw_dat.rds
├── processed
│   ├── bow_nlp.png
│   ├── retweet_network.png
│   ├── twit_trend.png
│   └── user_activity.png
└── report
    ├── make_report.Rmd
    ├── make_report.docx
    ├── make_report.html
    ├── make_report.log
    ├── make_report.pdf
    ├── twitter_report.Rmd
    └── twitter_report.html

5 directories, 24 files
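
The pattern described in this section (a script in R/ that reads the collected data and writes its result into processed/) can be illustrated with one of the files in the listing. The following is only a minimal sketch under stated assumptions, not the original graph_trend.R, which is not reproduced here. The function name graph_trend(), its arguments, and the n_tweets column are hypothetical, and the code assumes that tw_dat has a POSIXct created_at column, as data returned by rtweet's search_tweets() does.

```r
# R/graph_trend.R (hypothetical sketch; the original script is not shown)
library(dplyr)
library(ggplot2)

# Count tweets per day and save the trend plot as a PNG file.
# Assumption: `tw_dat` has a POSIXct `created_at` column.
graph_trend <- function(tw_dat, out_path = "processed/twit_trend.png") {
  daily <- tw_dat %>%
    mutate(day = as.Date(created_at)) %>%
    count(day, name = "n_tweets")

  trend_plot <- ggplot(daily, aes(x = day, y = n_tweets)) +
    geom_line() +
    geom_point() +
    labs(x = "Date", y = "Number of tweets")

  # Create the output directory on first use
  dir.create(dirname(out_path), recursive = TRUE, showWarnings = FALSE)
  ggsave(out_path, plot = trend_plot, width = 7, height = 4)

  # Return the daily counts invisibly so callers can reuse them
  invisible(daily)
}

# Run only when the collected data is available
if (file.exists("data/tw_dat.rds")) {
  graph_trend(readRDS("data/tw_dat.rds"))
}
```

Keeping the output path as an argument with a processed/ default means the same function can be called from a Makefile rule or interactively, and a later analysis can follow the same shape: read from data/, write to processed/.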