Environment and packages

In [39]:
# activate the environment
using Pkg
Pkg.activate("env_julia_cairomakie")
  Activating project at `c:\Users\ricco\Desktop\demo\env_julia_cairomakie`
In [40]:
# list the installed packages
Pkg.status()
Status `C:\Users\ricco\Desktop\demo\env_julia_cairomakie\Project.toml`
  [13f3f980] CairoMakie v0.15.11
  [a93c6f00] DataFrames v1.8.2
  [38e38edf] GLM v1.9.5
  [7073ff75] IJulia v1.34.4
  [fdbf4ff8] XLSX v0.11.10

Importing and inspecting the data

In [41]:
# packages
import DataFrames as DFR
import XLSX

# read the data
df = DFR.DataFrame(XLSX.readtable("./data_crm.xlsx"))

# summary statistics for each column
DFR.describe(df)
13×7 DataFrame
 Row │ variable         mean     min          median   max      nmissing  eltype
     │ Symbol           Union…   Any          Union…   Any      Int64     DataType
─────┼────────────────────────────────────────────────────────────────────────────
   1 │ Age              41.56    21           42.0     70              0  Int64
   2 │ Revenu           46686.4  18000        46385.5  79318           0  Int64
   3 │ ScoreCredit      549.882  300          546.5    850             0  Int64
   4 │ Anciennete       7.178    0            5.0      19              0  Int64
   5 │ Montant          10357.1  3000         10369.0  24878           0  Int64
   6 │ Endettement      22.2222  5.0          21.55    66.1            0  Float64
   7 │ SitFamiliale              Celibataire           Marie           0  String
   8 │ TypeContTravail           CDD                   Interim         0  String
   9 │ SecteurTravail            Commerce              Tech            0  String
  10 │ HistoIncidents   0.408    0            0.0      3               0  Int64
  11 │ SituImmobilier            Non                   Oui             0  String
  12 │ CanalAcquis               Agence                Web             0  String
  13 │ CreditOK                  Non                   Oui             0  String
In [42]:
# number of observations
n = DFR.nrow(df)
println(n)
500

Plots - One variable

In [43]:
# import the library under an alias
import CairoMakie as CM

Distribution - Quantitative variable

In [44]:
# histogram
CM.hist(df.Revenu)
In [45]:
# finer control over the axes
# create the Figure() explicitly
fig = CM.Figure()
ax = CM.Axis(fig[1,1],
            xticklabelsize = 10,
            xtickformat = "{:.0f}")
CM.hist!(ax, df.Revenu)  # the ! variant draws into the existing axis
fig
In [46]:
# two histograms side by side
# the figure is split into a 1×2 grid
fig = CM.Figure()
ax_1 = CM.Axis(fig[1,1], xticklabelsize = 10, xtickformat = "{:.0f}", title = "Revenu")
ax_2 = CM.Axis(fig[1,2], xticklabelsize = 10, xtickformat = "{:.2f}", title = "Endettement")

CM.hist!(ax_1,df.Revenu)
CM.hist!(ax_2,df.Endettement, color = :green)
fig
In [47]:
# density plot
CM.density(df.Revenu)
In [48]:
# boxplot
# a grouping variable is mandatory
# workaround: create a constant group with fill()
fig = CM.Figure()
ax = CM.Axis(fig[1,1], yticklabelsize = 10, ytickformat = "{:.0f}")
CM.boxplot!(ax,fill(1,n),df.Revenu)
fig
In [49]:
# violin
CM.violin(fill(1,n),df.Revenu)

Line

In [50]:
import Statistics
# line plot of the sorted scores, with the median as a horizontal line
fig = CM.Figure()
ax = CM.Axis(fig[1,1])
CM.lines!(ax,1:n,sort(df.ScoreCredit))
CM.hlines!([Statistics.median(df.ScoreCredit)],color=:green)
fig

Distribution - Qualitative variable

In [51]:
# barplot
# employment contract type: count the rows in each category
v = DFR.combine(DFR.groupby(df,:TypeContTravail),DFR.nrow => :effectif)
v
3×2 DataFrame
 Row │ TypeContTravail  effectif
     │ String           Int64
─────┼───────────────────────────
   1 │ CDI                   308
   2 │ Interim                62
   3 │ CDD                   130
In [52]:
# draw the barplot -> raises an error when x is a vector of strings
CM.barplot(v.TypeContTravail,v.effectif)
ArgumentError: 

    Conversion failed for Makie.BarPlot (With conversion trait Makie.PointBased()) with args:

        Tuple{Vector{String}, Vector{Int64}} 

    Got converted to: Tuple{Vector{String}, Vector{Int64}}

    Makie.BarPlot requires to convert to argument types Tuple{AbstractVector{<:Union{GeometryBasics.Point2, GeometryBasics.Point3}}}, which convert_arguments didn't succeed in.

    To fix this overload convert_arguments(P, args...) for Makie.BarPlot or Makie.PointBased() and return an object of type Tuple{AbstractVector{<:Union{GeometryBasics.Point2, GeometryBasics.Point3}}}.`





Stacktrace:

 [1] argument_error(PTrait::Makie.PointBased, P::Type, args::Tuple{Vector{String}, Vector{Int64}}, user_kw::Dict{Symbol, Any}, converted::Tuple{Vector{String}, Vector{Int64}})

   @ Makie C:\Users\ricco\.julia\packages\Makie\WKgwk\src\compute-plots.jl:724

 [2] (Makie.BarPlot)(user_args::Tuple{Vector{String}, Vector{Int64}}, user_attributes::Dict{Symbol, Any})

   @ Makie C:\Users\ricco\.julia\packages\Makie\WKgwk\src\compute-plots.jl:773

 [3] _create_plot(::Function, ::Dict{Symbol, Any}, ::Vector{String}, ::Vararg{Any})

   @ Makie C:\Users\ricco\.julia\packages\Makie\WKgwk\src\figureplotting.jl:458

 [4] barplot(::Vector{String}, ::Vararg{Any}; kw::@Kwargs{})

   @ Makie C:\Users\ricco\.julia\packages\Makie\WKgwk\src\recipes.jl:546

 [5] barplot(::Vector{String}, ::Vararg{Any})

   @ Makie C:\Users\ricco\.julia\packages\Makie\WKgwk\src\recipes.jl:544

 [6] top-level scope

   @ c:\Users\ricco\Desktop\demo\jl_notebook_cell_df34fa98e69747e1a8f8a730347b8e2f_X25sZmlsZQ==.jl:2
In [53]:
# bars must be placed at numeric positions;
# the labels are then attached to those positions through xticks
fig = CM.Figure()
ax = CM.Axis(fig[1,1],xticks=(1:DFR.nrow(v),v.TypeContTravail))
CM.barplot!(ax,1:DFR.nrow(v),v.effectif)
fig
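The workaround in the previous cell separates where a bar sits (a number) from what it is called (a tick label). The sketch below reproduces that bookkeeping in plain Julia, with the counts copied from the frequency table. Sorting the bars by decreasing count is an illustrative addition that the notebook does not perform, and the names `order`, `positions` and `ticks` are placeholders.

```julia
# Counts copied from the frequency table of TypeContTravail
categories = ["CDI", "Interim", "CDD"]
counts = [308, 62, 130]

# Illustrative choice: order the bars by decreasing count
order = sortperm(counts; rev = true)
sorted_labels = categories[order]
sorted_counts = counts[order]

# Numeric positions stand in for the string labels on the x axis
positions = collect(1:length(sorted_labels))

# A (positions, labels) tuple is the shape passed to `xticks` in the barplot cell
ticks = (positions, sorted_labels)

println(ticks)
println(sorted_counts)
```

`positions` and `sorted_counts` would then go to `CM.barplot!`, and `ticks` to the `xticks` attribute of the axis, in the same way as in the barplot cell.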
In [54]:
# see also pie
# getting the labels to appear is laborious: one legend entry per slice
couleurs = [:red,:green,:blue]
fig, ax, plt = CM.pie(v.effectif,
                    color=couleurs,
                    label=[v.TypeContTravail[i] =>
                           (; color = c) for (i,c) in enumerate(couleurs)])
leg = CM.Legend(fig[1,2],ax)
fig

Plots - Two or more variables

Scatter plot and regression

In [55]:
# scatter plot
CM.scatter(df.Revenu, df.ScoreCredit)
In [56]:
# simple linear regression
import GLM
reg = GLM.lm(@GLM.formula(ScoreCredit ~ Revenu),df)
reg
StatsModels.TableRegressionModel{GLM.LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, Matrix{Float64}}

ScoreCredit ~ 1 + Revenu

Coefficients:
──────────────────────────────────────────────────────────────────────────────────
                    Coef.   Std. Error      t  Pr(>|t|)    Lower 95%     Upper 95%
──────────────────────────────────────────────────────────────────────────────────
(Intercept)  163.715       8.61687      19.00    <1e-60  146.785      180.645
Revenu         0.00827151  0.000178807  46.26    <1e-99    0.0079202    0.00862282
──────────────────────────────────────────────────────────────────────────────────
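For a regression with a single predictor, the two coefficients have a closed form: the slope is cov(x, y) / var(x) and the intercept is mean(y) - slope * mean(x). The sketch below applies these formulas to made-up data lying exactly on a line, as a way to cross-check what a fitting routine reports. The helper name `simple_ols` and the toy values are assumptions for the illustration, not part of the notebook.

```julia
import Statistics

# Closed-form least squares for y ≈ intercept + slope * x
function simple_ols(x, y)
    slope = Statistics.cov(x, y) / Statistics.var(x)
    intercept = Statistics.mean(y) - slope * Statistics.mean(x)
    return intercept, slope
end

# Toy data (made up for the illustration): an exact line y = 2 + 3x
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = 2 .+ 3 .* x

intercept, slope = simple_ols(x, y)
println("intercept = ", intercept, " ; slope = ", slope)
```

Applied to `df.Revenu` and `df.ScoreCredit`, the same two lines should agree with the `Coef.` column above up to rounding.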
In [57]:
# extract the coefficients
coef = GLM.coef(reg)

# scatter plot with the regression line
# coef[1] -> intercept
# coef[2] -> slope
fig = CM.Figure()
ax = CM.Axis(fig[1,1])
CM.scatter!(ax,df.Revenu, df.ScoreCredit,markersize=5,color=:gray)
CM.ablines!(coef[1],coef[2],color=:green,linewidth=3)
fig
In [58]:
# regression residuals: observed minus fitted values
residus = df.ScoreCredit .- (coef[2] .* df.Revenu .+ coef[1])
residus
500-element Vector{Float64}:
  -26.06676234854865
   17.958659019122706
   82.82511678547291
  -35.087736739002594
 -106.78339641893342
  -37.838393356479514
   18.62830610364324
  -16.627174221447945
    8.654318699053533
  -36.916080663215325
    ⋮
   49.09502117179716
  -47.455261789385645
   87.81615472120325
  -30.63603175714354
  -14.234405644479466
  -90.10887856518855
  -48.28684422195295
  -75.33971699678727
   -9.9718518722633
In [59]:
# residual plot (residuals against the observed ScoreCredit)
fig = CM.Figure()
ax = CM.Axis(fig[1,1])
CM.scatter!(df.ScoreCredit,residus,markersize=5,color=:gray)
CM.hlines!([0],linewidth=2,color=:blue)
fig
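When reading a residual plot, it helps to know what a least-squares fit with an intercept guarantees. The residuals sum to zero and are uncorrelated with the fitted values, but they are correlated with the observed response by construction, with a correlation equal to sqrt(1 - R²). A plot of residuals against the observed values therefore shows an upward trend even for a well-specified model, which is why residuals are more commonly plotted against the fitted values or the predictor. The sketch below checks these properties on made-up data; all names and values are assumptions for the illustration.

```julia
import Statistics

# Toy data, made up for the illustration
x = collect(1.0:10.0)
noise = [0.5, -0.3, 0.8, -1.1, 0.2, 0.9, -0.7, 0.4, -0.6, 0.1]
y = 1.0 .+ 2.0 .* x .+ noise

# Closed-form least squares fit with an intercept
slope = Statistics.cov(x, y) / Statistics.var(x)
intercept = Statistics.mean(y) - slope * Statistics.mean(x)

fitted = intercept .+ slope .* x
resid = y .- fitted

# Coefficient of determination
r2 = 1 - sum(abs2, resid) / sum(abs2, y .- Statistics.mean(y))

println("sum of residuals       = ", sum(resid))
println("cor(fitted, residuals) = ", Statistics.cor(fitted, resid))
println("cor(y, residuals)      = ", Statistics.cor(y, resid))
println("sqrt(1 - R^2)          = ", sqrt(1 - r2))
```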

Other scatter plots

In [60]:
# scatter plot conditioned on a categorical variable
les_oui = (df.CreditOK .== "Oui")
les_non = (df.CreditOK .== "Non")

fig = CM.Figure()
ax = CM.Axis(fig[1,1])
CM.scatter!(df.Revenu[les_oui],df.ScoreCredit[les_oui],color=:blue)
CM.scatter!(df.Revenu[les_non],df.ScoreCredit[les_non],color=:red)
fig
In [61]:
# scatter plot with marker size driven by Montant

# min-max scaling: sizes between 0 and 1
montant = df.Montant
taille = (montant .- minimum(montant)) ./ (maximum(montant) - minimum(montant))

# plot
CM.scatter(df.Revenu,df.ScoreCredit,markersize=taille*15)
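With a scaling to [0, 1], the observation holding the minimum amount gets a marker of size 0 and disappears from the plot. A possible variant, sketched below, maps the values onto a range with a strictly positive lower bound. The helper name `rescale` and the bounds 3 and 15 are assumptions chosen for the illustration.

```julia
# Map a numeric vector linearly onto [lo, hi]; `lo` and `hi` are illustrative defaults
function rescale(v; lo = 3.0, hi = 15.0)
    vmin, vmax = extrema(v)
    # Constant input: every point gets the midpoint size
    vmin == vmax && return fill((lo + hi) / 2, length(v))
    return lo .+ (hi - lo) .* (v .- vmin) ./ (vmax - vmin)
end

# Toy amounts: 3000 and 24878 are the minimum and maximum reported for Montant
montant = [3000, 10369, 24878]
sizes = rescale(montant)
println(sizes)
```

The result could be passed directly as `markersize`, for example `markersize = rescale(df.Montant)`.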

Conditional distribution

In [62]:
# conditional boxplot -> does not work directly with a vector of strings
CM.boxplot(df.CreditOK, df.Revenu)
Failed to resolve arg1:

[ComputeEdge] arg1 = compute_identity((outlier_points, ), changed, cached)

  @ C:\Users\ricco\.julia\packages\ComputePipeline\E2l50\src\ComputePipeline.jl:743

[ComputeEdge] outlier_points, outlier_indices, q1s, q5s = (::MapFunctionWrapper(#1372))((y, groups, quantiles, range, show_outliers, orientation, ), changed, cached)

  @ unknown method location

[ComputeEdge] x, y = (::MapFunctionWrapper(#_register_argument_conversions!##8))((converted, ), changed, cached)

  @ C:\Users\ricco\.julia\packages\Makie\WKgwk\src\compute-plots.jl:533

  with edge inputs:

    converted = ((["Non", "Non", "Oui", "Oui", "Non", "Oui", "Oui", "Non", "Oui", "Oui"  …  "Oui", "Non", "Oui", "Oui", "Non", "Non", "Non", "Non", "Oui", "Oui"], [63997, 41628, 51195, 57592, 57797, 48011, 50977, 54393, 41423, 55516  …  46698, 18762, 51108, 70056, 29852, 54950, 44175, 44801, 52303, 61084]),)

Triggered by update of:

  dim_convert_2, arg1, dim_convert_1 or arg2

Due to ERROR: Result needs to have same length. Found: ((["Non", "Non", "Oui", "Oui", "Non", "Oui", "Oui", "Non", "Oui", "Oui", "Oui", "Non", "Oui", "Oui", "Oui", "Non", "Oui", "Non", "Oui", "Oui", "Oui", "Oui", "Non", "Oui", "Oui", "Non", "Oui", "Non", "Oui", "Non", "Non", "Non", "Oui", "Non", "Oui", "Non", "Non", "Non", "Non", "Non", "Non", "Non", "Non", "Non", "Oui", "Oui", "Oui", "Non", "Non", "Oui", "Non", "Non", "Oui", "Oui", "Non", "Oui", "Oui", "Oui", "Non", "Oui", "Non", "Oui", "Oui", "Non", "Oui", "Oui", "Non", "Non", "Non", "Non", "Non", "Oui", "Non", "Oui", "Non", "Non", "Non", "Non", "Non", "Non", "Non", "Oui", "Non", "Oui", "Non", "Non", "Oui", "Non", "Non", "Non", "Non", "Non", "Oui", "Non", "Oui", "Oui", "Oui", "Oui", "Oui", "Non", "Non", "Non", "Non", "Non", "Oui", "Oui", "Oui", "Non", "Oui", "Non", "Non", "Oui", "Non", "Non", "Non", "Non", "Oui", "Oui", "Non", "Non", "Non", "Non", "Non", "Non", "Oui", "Oui", "Non", "Oui", "Non", "Oui", "Non", "Non", "Oui", "Oui", "Oui", "Oui", "Non", "Non", "Non", "Oui", "Oui", "Oui", "Non", "Oui", "Non", "Oui", "Non", "Non", "Oui", "Non", "Oui", "Oui", "Oui", "Non", "Non", "Oui", "Oui", "Oui", "Non", "Non", "Non", "Oui", "Non", "Oui", "Oui", "Oui", "Oui", "Oui", "Oui", "Oui", "Non", "Non", "Oui", "Non", "Non", "Non", "Non", "Oui", "Oui", "Non", "Oui", "Oui", "Non", "Non", "Non", "Non", "Non", "Non", "Oui", "Oui", "Oui", "Non", "Non", "Oui", "Non", "Oui", "Non", "Non", "Non", "Oui", "Oui", "Oui", "Oui", "Non", "Non", "Oui", "Non", "Oui", "Non", "Non", "Oui", "Oui", "Oui", "Oui", "Non", "Oui", "Oui", "Non", "Oui", "Oui", "Oui", "Non", "Non", "Oui", "Non", "Oui", "Oui", "Oui", "Non", "Non", "Non", "Non", "Non", "Non", "Non", "Non", "Oui", "Oui", "Oui", "Non", "Oui", "Non", "Oui", "Oui", "Non", "Oui", "Oui", "Oui", "Oui", "Oui", "Oui", "Oui", "Non", "Non", "Oui", "Non", "Oui", "Oui", "Non", "Non", "Oui", "Non", "Oui", "Non", "Oui", "Non", "Non", "Non", "Oui", "Non", "Oui", "Oui", "Oui", "Non", "Oui", "Oui", "Non", 
"Oui", "Oui", "Oui", "Oui", "Non", "Oui", "Oui", "Non", "Non", "Non", "Non", "Oui", "Non", "Oui", "Non", "Oui", "Non", "Oui", "Oui", "Oui", "Non", "Oui", "Non", "Oui", "Oui", "Non", "Non", "Non", "Oui", "Oui", "Oui", "Oui", "Non", "Oui", "Non", "Non", "Non", "Non", "Oui", "Oui", "Non", "Oui", "Oui", "Oui", "Oui", "Oui", "Oui", "Non", "Non", "Oui", "Non", "Oui", "Non", "Oui", "Non", "Non", "Non", "Non", "Oui", "Non", "Non", "Oui", "Non", "Non", "Oui", "Oui", "Oui", "Non", "Oui", "Oui", "Oui", "Non", "Oui", "Oui", "Non", "Oui", "Oui", "Non", "Oui", "Oui", "Oui", "Non", "Non", "Oui", "Oui", "Non", "Non", "Oui", "Oui", "Oui", "Oui", "Oui", "Non", "Non", "Non", "Oui", "Oui", "Oui", "Oui", "Non", "Non", "Non", "Oui", "Oui", "Oui", "Oui", "Oui", "Non", "Oui", "Oui", "Oui", "Non", "Oui", "Non", "Non", "Oui", "Oui", "Non", "Oui", "Non", "Non", "Oui", "Oui", "Oui", "Oui", "Non", "Oui", "Oui", "Non", "Oui", "Oui", "Oui", "Non", "Non", "Oui", "Non", "Non", "Non", "Oui", "Oui", "Oui", "Oui", "Oui", "Non", "Non", "Non", "Non", "Non", "Oui", "Oui", "Oui", "Non", "Oui", "Oui", "Oui", "Oui", "Oui", "Non", "Oui", "Oui", "Oui", "Non", "Oui", "Non", "Non", "Non", "Non", "Oui", "Non", "Non", "Non", "Non", "Oui", "Non", "Non", "Oui", "Non", "Non", "Non", "Non", "Non", "Non", "Oui", "Oui", "Oui", "Oui", "Non", "Non", "Non", "Oui", "Oui", "Non", "Non", "Oui", "Non", "Oui", "Oui", "Oui", "Non", "Non", "Oui", "Oui", "Non", "Oui", "Oui", "Non", "Oui", "Oui", "Oui", "Oui", "Non", "Oui", "Non", "Oui", "Non", "Oui", "Oui", "Non", "Non", "Non", "Non", "Oui", "Oui"], [63997, 41628, 51195, 57592, 57797, 48011, 50977, 54393, 41423, 55516, 59556, 38374, 36915, 53743, 62781, 32097, 56194, 46120, 49167, 63628, 44796, 57987, 29704, 44142, 71675, 38517, 62515, 34983, 38316, 38821, 33515, 19695, 42811, 43832, 62421, 30726, 29015, 56506, 26193, 46309, 55307, 50419, 53927, 42267, 55025, 65610, 53248, 35868, 38701, 58942, 32619, 25132, 50559, 50572, 42916, 55814, 52771, 36552, 34074, 65035, 52680, 60650, 
72667, 42545, 45512, 42283, 37728, 33742, 52583, 51240, 45601, 51816, 29006, 61802, 30780, 44893, 46029, 46659, 43435, 27882, 29365, 43088, 45697, 50664, 35215, 22191, 59245, 42948, 51690, 42250, 44010, 51987, 56921, 38711, 56491, 54330, 46547, 56796, 47943, 33109, 50496, 32175, 33707, 45504, 52404, 53363, 43198, 46580, 61719, 57598, 29796, 46237, 24794, 35511, 42026, 27061, 64062, 42431, 20580, 46827, 28893, 37346, 37266, 28653, 56320, 41740, 27267, 48337, 39490, 59498, 37490, 41712, 49062, 39345, 62419, 54976, 42125, 39170, 39725, 59299, 38996, 70309, 33648, 40328, 50971, 59914, 30943, 42631, 44662, 32466, 59029, 66097, 59760, 33605, 48867, 59878, 58904, 45628, 52870, 36050, 44047, 44752, 19318, 76484, 50455, 59679, 69345, 53096, 41578, 56440, 28001, 33789, 44279, 46656, 51138, 33033, 39206, 48746, 51293, 67370, 46284, 69715, 37212, 41395, 36876, 43347, 25469, 35501, 60709, 73777, 40168, 52720, 36895, 56361, 35580, 70390, 27152, 24376, 50442, 61262, 41388, 38511, 62260, 36337, 28746, 52036, 49857, 55489, 31979, 30980, 62249, 52595, 47133, 54797, 47333, 42308, 53644, 49394, 65966, 46747, 35170, 47873, 49454, 50317, 30241, 41749, 31098, 44416, 49635, 36181, 41129, 32348, 36683, 33683, 24275, 42304, 62433, 54127, 48700, 45490, 44709, 58879, 52108, 65875, 26689, 64023, 62610, 35242, 59245, 41312, 48501, 64623, 45243, 23064, 45955, 50788, 78403, 53603, 36806, 42627, 74491, 39988, 62629, 42201, 55020, 48824, 42996, 44843, 58627, 27586, 55303, 48617, 37085, 48795, 57758, 53572, 47882, 43118, 37223, 71638, 30401, 36149, 41366, 45437, 28462, 18247, 49712, 44709, 46409, 53657, 73448, 38992, 69332, 30416, 53448, 45313, 62714, 32631, 43011, 29496, 66890, 67092, 48047, 42651, 47001, 48622, 44943, 55499, 43927, 39805, 59582, 57456, 18000, 41407, 31545, 34911, 38253, 33918, 63840, 49076, 53518, 49614, 49894, 51386, 42203, 54093, 51179, 42566, 54589, 26501, 34973, 54789, 40713, 32604, 30777, 66171, 19832, 46362, 68263, 40440, 39049, 63214, 50463, 71123, 36108, 51586, 43645, 
56972, 18000, 33712, 52201, 41224, 62838, 51217, 38162, 55412, 47827, 52972, 37188, 37860, 62840, 50231, 53277, 34617, 57569, 72544, 55791, 51887, 60813, 42951, 20897, 33786, 53976, 35698, 57619, 52773, 41123, 40167, 52076, 44646, 51116, 39536, 49334, 79318, 52639, 54783, 45532, 49477, 41585, 47665, 50063, 41403, 60140, 63305, 32635, 61218, 48035, 49492, 41338, 53367, 49869, 44536, 48412, 58080, 59158, 33725, 46313, 52590, 56177, 52456, 29822, 59528, 39972, 46819, 29327, 40916, 61295, 45853, 40430, 52759, 39116, 39091, 62539, 41744, 41511, 54113, 60666, 59050, 34212, 44956, 68307, 64043, 45475, 40454, 30353, 61605, 54900, 45947, 54505, 64293, 51123, 33465, 25247, 42941, 67860, 40250, 48670, 38577, 42875, 48867, 37627, 40874, 37383, 58087, 33403, 32694, 36113, 36629, 46532, 46929, 46170, 60046, 47120, 34639, 54189, 46458, 32487, 42957, 35261, 39864, 74343, 36791, 53557, 64771, 54167, 47433, 49463, 62806, 49570, 26357, 33676, 70447, 44270, 34079, 56013, 52941, 64366, 28280, 30477, 31714, 46698, 18762, 51108, 70056, 29852, 54950, 44175, 44801, 52303, 61084]),), for func C:\Users\ricco\.julia\packages\ComputePipeline\E2l50\src\ComputePipeline.jl:1017



Stacktrace:

  [1] error(s::String)

    @ Base .\error.jl:44

  [2] ComputePipeline.TypedEdge(edge::ComputePipeline.ComputeEdge{ComputePipeline.ComputeGraph}, f::ComputePipeline.MapFunctionWrapper{false, Makie.var"#_register_argument_conversions!##8#_register_argument_conversions!##9"}, inputs::@NamedTuple{converted::Base.RefValue{Tuple{Tuple{Vector{String}, Vector{Int64}}}}})

    @ ComputePipeline C:\Users\ricco\.julia\packages\ComputePipeline\E2l50\src\ComputePipeline.jl:140

  [3] ComputePipeline.TypedEdge(edge::ComputePipeline.ComputeEdge{ComputePipeline.ComputeGraph})

    @ ComputePipeline C:\Users\ricco\.julia\packages\ComputePipeline\E2l50\src\ComputePipeline.jl:120

  [4] (::ComputePipeline.var"#resolve!##4#resolve!##5"{ComputePipeline.ComputeEdge{ComputePipeline.ComputeGraph}})()

    @ ComputePipeline C:\Users\ricco\.julia\packages\ComputePipeline\E2l50\src\ComputePipeline.jl:670

  [5] lock(f::ComputePipeline.var"#resolve!##4#resolve!##5"{ComputePipeline.ComputeEdge{ComputePipeline.ComputeGraph}}, l::ReentrantLock)

    @ Base .\lock.jl:335

  [6] resolve!(edge::ComputePipeline.ComputeEdge{ComputePipeline.ComputeGraph})

    @ ComputePipeline C:\Users\ricco\.julia\packages\ComputePipeline\E2l50\src\ComputePipeline.jl:665

  [7] _resolve!(computed::ComputePipeline.Computed)

    @ ComputePipeline C:\Users\ricco\.julia\packages\ComputePipeline\E2l50\src\ComputePipeline.jl:658

  [8] foreach

    @ .\abstractarray.jl:3188 [inlined]

  [9] (::ComputePipeline.var"#resolve!##4#resolve!##5"{ComputePipeline.ComputeEdge{ComputePipeline.ComputeGraph}})()

    @ ComputePipeline C:\Users\ricco\.julia\packages\ComputePipeline\E2l50\src\ComputePipeline.jl:667

 [10] lock(f::ComputePipeline.var"#resolve!##4#resolve!##5"{ComputePipeline.ComputeEdge{ComputePipeline.ComputeGraph}}, l::ReentrantLock)

    @ Base .\lock.jl:335

 [11] resolve!(edge::ComputePipeline.ComputeEdge{ComputePipeline.ComputeGraph})

    @ ComputePipeline C:\Users\ricco\.julia\packages\ComputePipeline\E2l50\src\ComputePipeline.jl:665

 [12] _resolve!(computed::ComputePipeline.Computed)

    @ ComputePipeline C:\Users\ricco\.julia\packages\ComputePipeline\E2l50\src\ComputePipeline.jl:658

 [13] foreach

    @ .\abstractarray.jl:3188 [inlined]

 [14] (::ComputePipeline.var"#resolve!##4#resolve!##5"{ComputePipeline.ComputeEdge{ComputePipeline.ComputeGraph}})()

    @ ComputePipeline C:\Users\ricco\.julia\packages\ComputePipeline\E2l50\src\ComputePipeline.jl:667

 [15] lock(f::ComputePipeline.var"#resolve!##4#resolve!##5"{ComputePipeline.ComputeEdge{ComputePipeline.ComputeGraph}}, l::ReentrantLock)

    @ Base .\lock.jl:335

 [16] resolve!(edge::ComputePipeline.ComputeEdge{ComputePipeline.ComputeGraph})

    @ ComputePipeline C:\Users\ricco\.julia\packages\ComputePipeline\E2l50\src\ComputePipeline.jl:665

 [17] _resolve!(computed::ComputePipeline.Computed)

    @ ComputePipeline C:\Users\ricco\.julia\packages\ComputePipeline\E2l50\src\ComputePipeline.jl:658

 [18] resolve!(computed::ComputePipeline.Computed)

    @ ComputePipeline C:\Users\ricco\.julia\packages\ComputePipeline\E2l50\src\ComputePipeline.jl:650

 [19] getindex

    @ C:\Users\ricco\.julia\packages\ComputePipeline\E2l50\src\ComputePipeline.jl:563 [inlined]

 [20] #_register_expand_arguments!##0

    @ C:\Users\ricco\.julia\packages\Makie\WKgwk\src\compute-plots.jl:399 [inlined]

 [21] iterate

    @ .\generator.jl:48 [inlined]

 [22] _collect(c::Vector{Symbol}, itr::Base.Generator{Vector{Symbol}, Makie.var"#_register_expand_arguments!##0#_register_expand_arguments!##1"{ComputePipeline.ComputeGraph}}, ::Base.EltypeUnknown, isz::Base.HasShape{1})

    @ Base .\array.jl:810

 [23] collect_similar

    @ .\array.jl:732 [inlined]

 [24] map

    @ .\abstractarray.jl:3372 [inlined]

 [25] _register_expand_arguments!(::Type{Makie.Scatter}, attr::ComputePipeline.ComputeGraph, inputs::Vector{Symbol}, is_merged::Bool)

    @ Makie C:\Users\ricco\.julia\packages\Makie\WKgwk\src\compute-plots.jl:399

 [26] _register_expand_arguments!

    @ C:\Users\ricco\.julia\packages\Makie\WKgwk\src\compute-plots.jl:395 [inlined]

 [27] register_arguments!

    @ C:\Users\ricco\.julia\packages\Makie\WKgwk\src\compute-plots.jl:373 [inlined]

 [28] (Makie.Scatter)(user_args::Tuple{ComputePipeline.Computed}, user_attributes::Dict{Symbol, Any})

    @ Makie C:\Users\ricco\.julia\packages\Makie\WKgwk\src\compute-plots.jl:769

 [29] _create_plot!(F::Function, attributes::Dict{Symbol, Any}, scene::Makie.BoxPlot{Tuple{Tuple{Vector{String}, Vector{Int64}}}}, args::ComputePipeline.Computed)

    @ Makie C:\Users\ricco\.julia\packages\Makie\WKgwk\src\figureplotting.jl:552

 [30] scatter!(::Makie.BoxPlot{Tuple{Tuple{Vector{String}, Vector{Int64}}}}, ::Vararg{Any}; kw::@Kwargs{color::ComputePipeline.Computed, marker::ComputePipeline.Computed, markersize::ComputePipeline.Computed, strokecolor::ComputePipeline.Computed, strokewidth::ComputePipeline.Computed, inspectable::ComputePipeline.Computed, colorrange::Makie.Automatic, visible::ComputePipeline.Computed})

    @ Makie C:\Users\ricco\.julia\packages\Makie\WKgwk\src\recipes.jl:550

 [31] plot!(plot::Makie.BoxPlot{Tuple{Tuple{Vector{String}, Vector{Int64}}}})

    @ Makie C:\Users\ricco\.julia\packages\Makie\WKgwk\src\stats\boxplot.jl:195

 [32] connect_plot!(parent::Makie.Scene, plot::Makie.BoxPlot{Tuple{Tuple{Vector{String}, Vector{Int64}}}})

    @ Makie C:\Users\ricco\.julia\packages\Makie\WKgwk\src\compute-plots.jl:843

 [33] plot!

    @ C:\Users\ricco\.julia\packages\Makie\WKgwk\src\interfaces.jl:211 [inlined]

 [34] plot!(ax::Makie.Axis, plot::Makie.BoxPlot{Tuple{Tuple{Vector{String}, Vector{Int64}}}})

    @ Makie C:\Users\ricco\.julia\packages\Makie\WKgwk\src\figureplotting.jl:573

 [35] plot!(fa::Makie.FigureAxis, plot::Makie.BoxPlot{Tuple{Tuple{Vector{String}, Vector{Int64}}}})

    @ Makie C:\Users\ricco\.julia\packages\Makie\WKgwk\src\figureplotting.jl:569

 [36] _create_plot(::Function, ::Dict{Symbol, Any}, ::Vector{String}, ::Vararg{Any})

    @ Makie C:\Users\ricco\.julia\packages\Makie\WKgwk\src\figureplotting.jl:460

 [37] boxplot(::Vector{String}, ::Vararg{Any}; kw::@Kwargs{})

    @ Makie C:\Users\ricco\.julia\packages\Makie\WKgwk\src\recipes.jl:546

 [38] boxplot(::Vector{String}, ::Vararg{Any})

    @ Makie C:\Users\ricco\.julia\packages\Makie\WKgwk\src\recipes.jl:544

 [39] top-level scope

    @ c:\Users\ricco\Desktop\demo\jl_notebook_cell_df34fa98e69747e1a8f8a730347b8e2f_X45sZmlsZQ==.jl:2
In [63]:
# boxplot of income by credit approval => the categories must be encoded by hand
# unique values
labels = sort(unique(df.CreditOK))
println(labels)

# map each value to an integer code
codes = Int.(indexin(df.CreditOK,labels))
println(codes)

# then the figure
fig = CM.Figure()
ax = CM.Axis(fig[1,1],xlabel="Credit OK",xticks=(1:length(labels),labels))
CM.boxplot!(ax,codes, df.Revenu)
fig
["Non", "Oui"]
[1, 1, 2, 2, 1, 2, 2, 1, 2, 2, 2, 1, 2, 2, 2, 1, 2, 1, 2, 2, 2, 2, 1, 2, 2, 1, 2, 1, 2, 1, 1, 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1, 2, 1, 1, 2, 2, 1, 2, 2, 2, 1, 2, 1, 2, 2, 1, 2, 2, 1, 1, 1, 1, 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1, 1, 2, 1, 1, 1, 1, 1, 2, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 2, 2, 2, 1, 2, 1, 1, 2, 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 2, 2, 1, 2, 1, 2, 1, 1, 2, 2, 2, 2, 1, 1, 1, 2, 2, 2, 1, 2, 1, 2, 1, 1, 2, 1, 2, 2, 2, 1, 1, 2, 2, 2, 1, 1, 1, 2, 1, 2, 2, 2, 2, 2, 2, 2, 1, 1, 2, 1, 1, 1, 1, 2, 2, 1, 2, 2, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1, 2, 1, 2, 1, 1, 1, 2, 2, 2, 2, 1, 1, 2, 1, 2, 1, 1, 2, 2, 2, 2, 1, 2, 2, 1, 2, 2, 2, 1, 1, 2, 1, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 2, 1, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 1, 1, 2, 1, 2, 2, 1, 1, 2, 1, 2, 1, 2, 1, 1, 1, 2, 1, 2, 2, 2, 1, 2, 2, 1, 2, 2, 2, 2, 1, 2, 2, 1, 1, 1, 1, 2, 1, 2, 1, 2, 1, 2, 2, 2, 1, 2, 1, 2, 2, 1, 1, 1, 2, 2, 2, 2, 1, 2, 1, 1, 1, 1, 2, 2, 1, 2, 2, 2, 2, 2, 2, 1, 1, 2, 1, 2, 1, 2, 1, 1, 1, 1, 2, 1, 1, 2, 1, 1, 2, 2, 2, 1, 2, 2, 2, 1, 2, 2, 1, 2, 2, 1, 2, 2, 2, 1, 1, 2, 2, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 2, 2, 2, 2, 1, 1, 1, 2, 2, 2, 2, 2, 1, 2, 2, 2, 1, 2, 1, 1, 2, 2, 1, 2, 1, 1, 2, 2, 2, 2, 1, 2, 2, 1, 2, 2, 2, 1, 1, 2, 1, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 2, 2, 2, 1, 2, 2, 2, 2, 2, 1, 2, 2, 2, 1, 2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 1, 1, 1, 2, 2, 1, 1, 2, 1, 2, 2, 2, 1, 1, 2, 2, 1, 2, 2, 1, 2, 2, 2, 2, 1, 2, 1, 2, 1, 2, 2, 1, 1, 1, 1, 2, 2]
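Since the same encoding is needed for every grouped plot, it can be wrapped in a small helper. The sketch below is one possible version that uses a dictionary lookup instead of `indexin`; the function name `encode_categories` is hypothetical. It is applied to the first five values of `CreditOK` as they appear in the error message of the failed boxplot.

```julia
# Return the sorted distinct labels and, for each value, the index of its label
function encode_categories(vals)
    labels = sort(unique(vals))
    lookup = Dict(label => i for (i, label) in enumerate(labels))
    codes = [lookup[v] for v in vals]
    return labels, codes
end

# First five values of CreditOK
sample = ["Non", "Non", "Oui", "Oui", "Non"]
labels, codes = encode_categories(sample)
println(labels)
println(codes)
```

The returned `labels` feed the `xticks` attribute and `codes` serve as the grouping argument of `CM.boxplot!` or `CM.violin!`.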
In [64]:
# same approach for the violin plot
fig = CM.Figure()
ax = CM.Axis(fig[1,1],xlabel="Credit OK",xticks=(1:length(labels),labels))
CM.violin!(ax,codes, df.Revenu)
fig
In [65]:
# conditional density
fig = CM.Figure()
ax = CM.Axis(fig[1,1])
# density for the "Oui" group
CM.density!(ax,df.Revenu[df.CreditOK .== "Oui"])
# with some transparency (alpha) -> the "Non" group
CM.density!(ax,df.Revenu[df.CreditOK .== "Non"],alpha=0.4,color=:green)
fig
In [66]:
# another way to look at it: stacked histograms on a common x range
min_rev = minimum(df.Revenu)
max_rev = maximum(df.Revenu)

# histograms
fig = CM.Figure()
ax_1 = CM.Axis(fig[1,1],title="Oui")
ax_2 = CM.Axis(fig[2,1],title="Non")
CM.xlims!(ax_1,min_rev,max_rev)
CM.xlims!(ax_2,min_rev,max_rev)
CM.hist!(ax_1,df.Revenu[df.CreditOK .== "Oui"])
CM.hist!(ax_2,df.Revenu[df.CreditOK .== "Non"],color=:green)
fig
In [67]:
# shift between the two empirical distribution functions
fig = CM.Figure()
ax = CM.Axis(fig[1,1],title="Empirical Cumulative Distribution Function")
CM.ecdfplot!(ax,df.Revenu[df.CreditOK .== "Oui"],color=:blue,label="Oui")
CM.ecdfplot!(ax,df.Revenu[df.CreditOK .== "Non"],color=:green,label="Non")
CM.Legend(fig[1,2],ax)
fig
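The empirical cumulative distribution function is simple to compute by hand: once the sample is sorted, the value at the i-th smallest observation is i / n. The sketch below shows that computation on a made-up sample. The helper name `ecdf_points` is hypothetical, and this is not necessarily how `CM.ecdfplot!` proceeds internally.

```julia
# Empirical CDF: at the i-th smallest value, the cumulative proportion is i / n
function ecdf_points(x)
    xs = sort(x)
    n = length(xs)
    return xs, collect(1:n) ./ n
end

# Toy sample, made up for the illustration
xs, ps = ecdf_points([30, 10, 20, 40])
println(xs)
println(ps)
```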
In [68]:
# medians
med_oui = Statistics.median(df.Revenu[df.CreditOK .== "Oui"])
med_non = Statistics.median(df.Revenu[df.CreditOK .== "Non"])
println("Mediane(oui) = $med_oui ; Mediane(non) = $med_non")
Mediane(oui) = 53172.0 ; Mediane(non) = 39765.0
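The two medians above are computed with one line per group. With more than two categories, a small helper that loops over the distinct group values avoids the repetition. The sketch below is one way to write it in plain Julia; the name `median_by_group` and the toy values are assumptions for the illustration.

```julia
import Statistics

# Median of `vals` within each distinct value of `groups`
function median_by_group(vals, groups)
    return Dict(g => Statistics.median(vals[groups .== g]) for g in unique(groups))
end

# Toy data, made up for the illustration
vals = [10, 20, 30, 45, 50]
groups = ["Oui", "Non", "Oui", "Non", "Oui"]
meds = median_by_group(vals, groups)
println(meds)
```

On the notebook's data the call would be `median_by_group(df.Revenu, df.CreditOK)`.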
In [69]:
# another way to compare the empirical distributions
# QQ plot
fig = CM.Figure()
ax = CM.Axis(fig[1,1])
CM.xlims!(ax,min_rev,max_rev)
CM.ylims!(ax,min_rev,max_rev)
CM.qqplot!(ax,df.Revenu[df.CreditOK .== "Oui"],df.Revenu[df.CreditOK .== "Non"])
CM.scatter!([med_oui],[med_non],color=:red,markersize=25,marker=:cross)
CM.ablines!(0,1,color=:green)
fig
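A two-sample QQ plot pairs the quantiles of the same order from each group; the red cross in the cell above is the pair of medians, that is, the quantiles of order 0.5. The sketch below computes such pairs on a grid of probabilities with `Statistics.quantile`. The helper name `qq_points`, the probability grid and the toy samples are assumptions, and the exact quantile rule used by `CM.qqplot!` may differ.

```julia
import Statistics

# Quantiles of two samples on a common grid of probabilities
function qq_points(a, b; probs = 0.1:0.1:0.9)
    qa = [Statistics.quantile(a, p) for p in probs]
    qb = [Statistics.quantile(b, p) for p in probs]
    return qa, qb
end

# Toy samples (made up): b is a shifted by 10, so the points lie on y = x + 10
a = collect(1.0:100.0)
b = a .+ 10
qa, qb = qq_points(a, b)
println(qb .- qa)
```

Points parallel to the line y = x indicate a pure location shift between the groups; a different slope would indicate a difference in spread.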

Heatmap (1) - Correlations

In [70]:
# numeric variables
X = DFR.select(df,names(df,DFR.Number))
names(X)
7-element Vector{String}:
 "Age"
 "Revenu"
 "ScoreCredit"
 "Anciennete"
 "Montant"
 "Endettement"
 "HistoIncidents"
In [71]:
# compute the correlation matrix
cor_mat = Statistics.cor(Matrix(X))
cor_mat
7×7 Matrix{Float64}:
  1.0        -0.0809775  -0.0922087  -0.0766463  …   0.0424002  -0.0506387
 -0.0809775   1.0         0.900675    0.103156      -0.0546184   0.0194893
 -0.0922087   0.900675    1.0         0.347076      -0.0345155  -0.249708
 -0.0766463   0.103156    0.347076    1.0            0.0262538   0.0177185
 -0.0142236   0.547024    0.488969    0.0508609      0.695975    0.00864319
  0.0424002  -0.0546184  -0.0345155   0.0262538  …   1.0        -0.0422676
 -0.0506387   0.0194893  -0.249708    0.0177185     -0.0422676   1.0
In [72]:
# number of variables
p = size(cor_mat, 1)
p
7
In [73]:
# heatmap
fig = CM.Figure()
ax = CM.Axis(fig[1, 1],
            xticks = (1:p,names(X)),
            yticks = (1:p,names(X)),
            xticklabelrotation = π / 2,
            yreversed=true) # reverse the y axis so that the first row is at the top

echelle = CM.heatmap!(ax, cor_mat, colormap = :coolwarm, colorrange=(-1,+1))
CM.Colorbar(fig[1,2],echelle)
fig
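A heatmap gives the overall picture, but a ranked list of pairs is often easier to read when looking for the strongest links, such as the 0.900675 between Revenu and ScoreCredit in the matrix printed earlier. The sketch below lists the off-diagonal pairs of a correlation matrix by decreasing absolute value. The helper name `ranked_pairs` and the toy matrix are assumptions for the illustration.

```julia
import Statistics

# List the off-diagonal pairs of a correlation matrix, strongest first
function ranked_pairs(cor_mat, labels)
    p = size(cor_mat, 1)
    triples = [(labels[i], labels[j], cor_mat[i, j]) for i in 1:p for j in (i + 1):p]
    return sort(triples; by = t -> abs(t[3]), rev = true)
end

# Toy data, made up for the illustration: column b is exactly 2 × column a
X = [1.0 2.0 5.0;
     2.0 4.0 3.0;
     3.0 6.0 6.0;
     4.0 8.0 1.0]
ranking = ranked_pairs(Statistics.cor(X), ["a", "b", "c"])
foreach(println, ranking)
```

With the notebook's objects, the call would be `ranked_pairs(cor_mat, names(X))`, where `X` is the data frame of numeric columns.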

Heatmap (2) - Conditional means

In [74]:
# mean income (Revenu) by
# marital status (SitFamiliale) and employment sector (SecteurTravail)
res = DFR.combine(DFR.groupby(df,[:SitFamiliale,:SecteurTravail]),:Revenu => Statistics.mean => :RevenuMoyen)
res
15×3 DataFrame
 Row │ SitFamiliale  SecteurTravail  RevenuMoyen
     │ String        String          Float64
─────┼───────────────────────────────────────────
   1 │ Marie         Finance             61761.8
   2 │ Marie         Tech                50050.0
   3 │ Celibataire   Industrie           41829.0
   4 │ Divorce       Finance             62541.7
   5 │ Celibataire   Sante               46546.1
   6 │ Celibataire   Finance             59360.1
   7 │ Celibataire   Commerce            34074.0
   8 │ Marie         Industrie           40676.4
   9 │ Celibataire   Tech                50042.3
  10 │ Divorce       Tech                53098.1
  11 │ Divorce       Commerce            39013.5
  12 │ Marie         Sante               45476.0
  13 │ Marie         Commerce            33392.7
  14 │ Divorce       Industrie           45437.0
  15 │ Divorce       Sante               46028.0
In [75]:
# as a cross-tabulation (wide format)
res_tab = DFR.unstack(res,:SitFamiliale,:SecteurTravail,:RevenuMoyen)
res_tab
3×6 DataFrame
 Row │ SitFamiliale  Finance   Tech      Industrie  Sante     Commerce
     │ String        Float64?  Float64?  Float64?   Float64?  Float64?
─────┼─────────────────────────────────────────────────────────────────
   1 │ Marie          61761.8   50050.0    40676.4   45476.0   33392.7
   2 │ Celibataire    59360.1   50042.3    41829.0   46546.1   34074.0
   3 │ Divorce        62541.7   53098.1    45437.0   46028.0   39013.5
In [76]:
# heatmap
fig = CM.Figure()

ax = CM.Axis(fig[1, 1],
            xticks = (1:(DFR.ncol(res_tab)-1),names(res_tab)[2:end]),
            yticks = (1:DFR.nrow(res_tab),res_tab.SitFamiliale),
            title = "Revenu moyen",
            yreversed=true) # reverse the y axis so that the first row is at the top
# the table is converted to a Matrix and transposed:
# heatmap! maps the first matrix dimension to x, so the sectors must come first
echelle = CM.heatmap!(ax, Matrix(res_tab[1:end,2:end])', colormap = :Blues)
CM.Colorbar(fig[1,2],echelle)
fig