ludwig
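The ludwig.ai website documents Ludwig's approach to deep learning in detail.
In short, Ludwig is a declarative deep learning framework, so anyone familiar with the grammar of graphics (ggplot) should find it easy to pick up.
On Rotten Tomatoes, a critic picks a fresh tomato for a film they liked and a rotten tomato otherwise, so a higher score means that more critics recommend the film. In Korea the score is widely known as the “썩토지수” (rotten tomato index).
Ludwig can handle both structured and unstructured data; here we use the Rotten Tomatoes dataset to build a deep learning model. The following code is taken from Ludwig Getting Started.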
# !pip install ludwig --user
import pandas as pd
from ludwig.api import LudwigModel
df = pd.read_csv('ludwig/rotten_tomatoes.csv')
model = LudwigModel(config='ludwig/rotten_tomatoes.yaml')
results = model.train(dataset=df)
# Lock 1420789236640 acquired on C:\swc\.lock_preprocessing
# Lock 1420789236640 released on C:\swc\.lock_preprocessing
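The contents of `ludwig/rotten_tomatoes.yaml` are not shown here, so the following is only a minimal sketch of what a declarative Ludwig configuration might look like, written as a Python dict. It assumes a single text input, `review_content`, and a binary output, `recommended`; both names come from the columns that appear in this document, but the real YAML file may declare more features and options. The helper `feature_names` is hypothetical and exists only for illustration.

```python
# Hypothetical stand-in for ludwig/rotten_tomatoes.yaml (the real file is not shown).
# A Ludwig config declares what goes in and what comes out, not how to train.
config = {
    "input_features": [
        # Assumption: the review text is the only input feature.
        {"name": "review_content", "type": "text"},
    ],
    "output_features": [
        # The prediction columns are prefixed with "recommended",
        # which suggests a binary output feature with this name.
        {"name": "recommended", "type": "binary"},
    ],
}


def feature_names(cfg, section):
    """Return the feature names declared in one section of the config."""
    return [feature["name"] for feature in cfg[section]]


print(feature_names(config, "input_features"))
print(feature_names(config, "output_features"))

# With Ludwig installed, the dict can be passed instead of a YAML path:
#   model = LudwigModel(config=config)
```

Writing the configuration as a dict keeps the example in one file; a YAML file with the same keys would serve the same purpose.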
Training the deep learning model took some time. Once the trained model has been saved, it can be loaded again and used for inference.
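The model in this document is loaded from `results/api_experiment_run/model`. When training is repeated, it can be handy to locate the most recent run directory instead of hard-coding the path. The sketch below uses only the standard library; the helper name `latest_model_dir` is hypothetical, and the assumption that repeated runs create sibling folders sharing the `api_experiment_run` prefix should be checked against the actual `results` directory.

```python
from pathlib import Path


def latest_model_dir(results_dir="results", prefix="api_experiment_run"):
    """Return the `model` folder of the most recently modified run, or None."""
    candidates = [
        run / "model"
        for run in Path(results_dir).glob(prefix + "*")
        if (run / "model").is_dir()
    ]
    # Pick the newest folder by modification time; None if nothing was found.
    return max(candidates, key=lambda path: path.stat().st_mtime, default=None)


print(latest_model_dir())

# With Ludwig installed, the result can be handed to LudwigModel.load:
#   movie_model = LudwigModel.load(str(latest_model_dir()))
```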
import pandas as pd
from ludwig.api import LudwigModel
C:\Users\STATKC~1\ANACON~1\lib\site-packages\dask\dataframe\utils.py:369: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
_numeric_index_types = (pd.Int64Index, pd.Float64Index, pd.UInt64Index)
C:\Users\STATKC~1\ANACON~1\lib\site-packages\dask\dataframe\utils.py:369: FutureWarning: pandas.Float64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
_numeric_index_types = (pd.Int64Index, pd.Float64Index, pd.UInt64Index)
C:\Users\STATKC~1\ANACON~1\lib\site-packages\dask\dataframe\utils.py:369: FutureWarning: pandas.UInt64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
_numeric_index_types = (pd.Int64Index, pd.Float64Index, pd.UInt64Index)
<frozen importlib._bootstrap>:219: RuntimeWarning: scipy._lib.messagestream.MessageStream size changed, may indicate binary incompatibility. Expected 56 from C header, got 64 from PyObject
C:\Users\statkclee\AppData\Roaming\Python\Python38\site-packages\torchaudio\backend\utils.py:62: UserWarning: No audio backend is available.
warnings.warn("No audio backend is available.")
movie_model = LudwigModel.load('results/api_experiment_run/model')
ray.init() failed: Could not find any running Ray instance. Please specify the one to connect to by setting `--address` flag or `RAY_ADDRESS` environment variable.
predictions, _ = movie_model.predict(dataset='ludwig/rotten_tomatoes_test.csv')
predictions.head()
recommended_probabilities ... recommended_probability
0 [0.10894948, 0.8910505] ... 0.891051
1 [0.20457983, 0.79542017] ... 0.795420
2 [0.0067676306, 0.99323237] ... 0.993232
3 [0.122318566, 0.87768143] ... 0.877681
4 [0.44897103, 0.55102897] ... 0.551029
[5 rows x 5 columns]
Let's look at the predictions in more detail.
library(reticulate)
library(tidyverse)
rt_csv <- read_csv('ludwig/rotten_tomatoes_test.csv')

py$predictions %>%
  janitor::clean_names() %>%
  bind_cols(rt_csv) %>%
  select(movie_title,
         review_content,
         recommended_probabilities,
         recommended_predictions,
         recommended_probability) %>%
  knitr::kable()
movie_title | review_content | recommended_probabilities | recommended_predictions | recommended_probability |
---|---|---|---|---|
It | … “It” is terrifically entertaining. | 0.1089495, 0.8910505 | TRUE | 0.8910505 |
Talk to Her | There is much to admire in Almodvar’s technical proficiency, but his quirky movies make little emotional impact. | 0.2045798, 0.7954202 | TRUE | 0.7954202 |
Suspiria | As Blanc, Swinton glides as if her feet never touch the ground… Her Josef is, by contrast, the film’s most moving element. | 0.006767631, 0.993232369 | TRUE | 0.9932324 |
The Road To Guantanamo | The material is beautifully put together, and it is powerful. | 0.1223186, 0.8776814 | TRUE | 0.8776814 |
Den of Thieves | At 140 minutes the film itself overstays its welcome, it is not half as clever as it clearly thinks it is and women are strictly optional extras. | 0.448971, 0.551029 | TRUE | 0.5510290 |
The Broken Circle Breakdown | Belgium’s Oscar entry is a shattering tale about the death of a six-year-old and its effects on family. Terrific bluegrass music. | 0.03033006, 0.96966994 | TRUE | 0.9696699 |
3:10 to Yuma | Unapologetically harsh and heedlessly entertaining despite its imperfections, the film includes two masterful performances from Crowe and Bale - actors so intense, they could wear red and intimidate a charging bull. | 0.01022094, 0.98977906 | TRUE | 0.9897791 |
Operation Finale | The sparring between Kingsley and Isaac is remarkable and the film’s structural flaws never blunt the the touching impact of its themes. | 0.0173105, 0.9826895 | TRUE | 0.9826895 |
Friends With Money | The cast is terrific, the movie isn’t. | 0.4993716, 0.5006284 | TRUE | 0.5006284 |
Final Destination 5 | A long and eventually tedious series of deaths, all in slightly sickening 3-D. Splattered eyeballs, snapped spines, heart kebabs - one numbingly after another, in diamond-hard focus and ruby-red color. | 0.97655839, 0.02344162 | FALSE | 0.9765584 |
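Judging from the rows above, `recommended_probabilities` appears to hold the pair [P(not recommended), P(recommended)], `recommended_predictions` is the more likely class, and `recommended_probability` is the probability of that predicted class (note the 0.9765584 reported for the FALSE row). The sketch below reproduces that relationship in plain Python. It is an interpretation of the output shown here, not Ludwig's own code, and the function name `summarize` is hypothetical.

```python
def summarize(probabilities):
    """Return (prediction, probability) from a [p_false, p_true] pair."""
    p_false, p_true = probabilities
    prediction = p_true >= p_false  # True means "recommended"
    return prediction, max(p_false, p_true)


# Two rows copied from the predictions shown in this document.
rows = {
    "It": [0.10894948, 0.8910505],
    "Final Destination 5": [0.97655839, 0.02344162],
}

for title, probabilities in rows.items():
    print(title, summarize(probabilities))
```

Read this way, a row such as Friends With Money (0.4993716, 0.5006284) is labelled TRUE by a very small margin, so the probability column is worth checking alongside the predicted label.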