2014-04-06 17 views
5

Forecast with auto.arima, with a long-term trend line: the 30-day forecast "jumps"

I'm trying to create a 30-day forecast using auto.arima from the forecast package. I want to capture the long-term trend, so I put it in the xreg argument.

The data:

dput(data) 
structure(list(TKDate = structure(c(15706, 15707, 15708, 15709, 
15710, 15711, 15712, 15713, 15714, 15715, 15716, 15717, 15718, 
15719, 15720, 15721, 15722, 15723, 15724, 15725, 15726, 15727, 
15728, 15729, 15730, 15731, 15732, 15733, 15734, 15735, 15736, 
15737, 15738, 15739, 15740, 15741, 15742, 15743, 15744, 15745, 
15746, 15747, 15748, 15749, 15750, 15751, 15752, 15753, 15754, 
15755, 15756, 15757, 15758, 15759, 15760, 15761, 15762, 15763, 
15764, 15765, 15766, 15767, 15768, 15769, 15770, 15771, 15772, 
15773, 15774, 15775, 15776, 15777, 15778, 15779, 15780, 15781, 
15782, 15783, 15784, 15785, 15786, 15787, 15788, 15789, 15790, 
15791, 15792, 15793, 15794, 15795, 15796, 15797, 15798, 15799, 
15800, 15801, 15802, 15803, 15804, 15805, 15806, 15807, 15808, 
15809, 15810, 15811, 15812, 15813, 15814, 15815, 15816, 15817, 
15818, 15819, 15820, 15821, 15822, 15823, 15824, 15825, 15826, 
15827, 15828, 15829, 15830, 15831, 15832, 15833, 15834, 15835, 
15836, 15837, 15838, 15839, 15840, 15841, 15842, 15843, 15844, 
15845, 15846, 15847, 15848, 15849, 15850, 15851, 15852, 15853, 
15854, 15855, 15856, 15857, 15858, 15859, 15860, 15861, 15862, 
15863, 15864, 15865, 15866, 15867, 15868, 15869, 15870, 15871, 
15872, 15873, 15874, 15875, 15876, 15877, 15878, 15879, 15880, 
15881, 15882, 15883, 15884, 15885, 15886, 15887, 15888, 15889, 
15890, 15891, 15892, 15893, 15894, 15895, 15896, 15897, 15898, 
15899, 15900, 15901, 15902, 15903, 15904, 15905, 15906, 15907, 
15908, 15909, 15910, 15911, 15912, 15913, 15914, 15915, 15916, 
15917, 15918, 15919, 15920, 15921, 15922, 15923, 15924, 15925, 
15926, 15927, 15928, 15929, 15930, 15931, 15932, 15933, 15934, 
15935, 15936, 15937, 15938, 15939, 15940, 15941, 15942, 15943, 
15944, 15945, 15946, 15947, 15948, 15949, 15950, 15951, 15952, 
15953, 15954, 15955, 15956, 15957, 15958, 15959, 15960, 15961, 
15962, 15963, 15964, 15965, 15966, 15967, 15968, 15969, 15970, 
15971, 15972, 15973, 15974, 15975, 15976, 15977, 15978, 15979, 
15980, 15981, 15982, 15983, 15984, 15985, 15986, 15987, 15988, 
15989, 15990, 15991, 15992, 15993, 15994, 15995, 15996, 15997, 
15998, 15999, 16000, 16001, 16002, 16003, 16004, 16005, 16006, 
16007, 16008, 16009, 16010, 16011, 16012, 16013, 16014, 16015, 
16016, 16017, 16018, 16019, 16020, 16021, 16022, 16023, 16024, 
16025, 16026, 16027, 16028, 16029, 16030, 16031, 16032, 16033, 
16034, 16035, 16036, 16037, 16038, 16039, 16040, 16041, 16042, 
16043, 16044, 16045, 16046, 16047, 16048, 16049, 16050, 16051, 
16052, 16053, 16054, 16055, 16056, 16057, 16058, 16059, 16060, 
16061, 16062, 16063, 16064, 16065, 16066, 16067, 16068, 16069, 
16070, 16071, 16072, 16073, 16074, 16075, 16076, 16077, 16078, 
16079, 16080, 16081, 16082, 16083, 16084, 16085, 16086, 16087, 
16088, 16089, 16090, 16091, 16092, 16093, 16094, 16095, 16096, 
16097, 16098, 16099, 16100, 16101, 16102, 16103, 16104, 16105, 
16106, 16107, 16108, 16109, 16110, 16111, 16112, 16113, 16114, 
16115, 16116, 16117, 16118), class = "Date"), spend = c(7984.39, 
11476.06, 6555.57, 3981.45, 3963.83, 4827.72, 6309.32, 13503.36, 
17075.89, 33353.71, 29324.34, 7968.68, 5540.63, 12113.45, 15596.38, 
19328.67, 20224.68, 18977.55, 16128.27, 10633.56, 11887.79, 17881.11, 
12613.46, 11607.55, 38232.11, 7861.25, 9397.88, 12056.02, 15115.87, 
12275.93, 14537.35, 9594.26, 8215.83, 9632.52, 9993.15, 13478.37, 
28509.38, 12016.33, 8907.76, 8757.43, 9513.09, 10299.5, 10385.03, 
12515.62, 9008.95, 17825.68, 9320.47, 11189.58, 12902.31, 13341.35, 
18675.32, 16989.53, 10114.53, 9876.65, 11203.39, 11718.73, 26264.95, 
12414.19, 12275.16, 9242.85, 8883.97, 10095.72, 11581.55, 14815.78, 
25064.12, 9297.07, 8047.91, 6876.37, 8881.63, 10982.85, 9975.33, 
24124.62, 8514.66, 15719.84, 5807.39, 8422.38, 15184.95, 14757.58, 
11087.61, 11070.78, 10425.67, 15517.8, 11257.69, 11915.47, 11720.37, 
34064.62, 6493.41, 5757.4, 4387.54, 6520.58, 7806.81, 6356.63, 
10916.36, 9013.43, 9722.41, 6044.25, 7971.7, 23933.54, 8627.85, 
9722.77, 18660.13, 13011.36, 11445.11, 14219.2, 17138.92, 16016.68, 
11434, 31379.03, 8494.25, 12493.85, 7708.1, 21583.05, 9026.17, 
9379.35, 8287.13, 7298.16, 6097.03, 8076.57, 12871.87, 11346.89, 
9115.82, 7737.98, 15065.38, 5262.73, 6522.58, 12743.94, 23945.16, 
16109.26, 6985.89, 6345.08, 6246.93, 6824.66, 8491.42, 9654.99, 
18976.58, 19565.68, 8075.47, 7219.79, 8629.04, 12491.64, 11915.89, 
27533.16, 13554.35, 10102.21, 20029.15, 11641.82, 15855.19, 14139.17, 
15376.63, 14625.99, 9098.87, 9396.64, 12015.84, 17532.75, 15131.65, 
15815.5, 16048.65, 9769.63, 9582.12, 11201.8, 12810, 18857.38, 
11822.71, 19289.08, 8911.29, 9437.55, 10987.14, 12995.65, 16675.26, 
9741.82, 9723.57, 10328.24, 7738.04, 8432.16, 23021.73, 10367.28, 
8210.53, 10468.4, 8024.25, 7296.25, 7445.34, 8539.59, 12386.23, 
15335.72, 9013.49, 7994.95, 7759.46, 8789.38, 11242.38, 28653.23, 
9750.96, 14398.62, 9248.74, 6766.08, 8159.14, 9899.38, 9453.35, 
17588.96, 8958.16, 8256.61, 6240.4, 7235.24, 23841.62, 9002.73, 
11839.47, 8693.31, 7161.37, 7046.39, 9221.53, 10004.93, 8698.76, 
7948.68, 9013.27, 18536.68, 7980.38, 8968.95, 23594.14, 17744.66, 
12615.73, 13646.05, 10512.58, 9066.02, 9665.15, 13183.2, 23864.45, 
12017.52, 10831.07, 8954.76, 7276.41, 7882.9, 16616.41, 15384.68, 
11046.53, 10621.01, 8094.74, 5451.26, 6237.79, 10717.69, 7076.38, 
7044.62, 7047.45, 7774.77, 6496.21, 6340.9, 7110.53, 7691.28, 
17482.02, 5576.19, 3763.79, 11477.68, 5710.5, 6519.51, 20022.61, 
13153.68, 6526.28, 5885.28, 5656.17, 6270.04, 9795.38, 6320.95, 
5741.98, 10808.72, 5150.87, 5416.52, 6305.05, 20953.12, 6569.02, 
6360.21, 9376.68, 4973.93, 5034.48, 6380.45, 15307.28, 14386.65, 
17705.88, 4779.52, 4784.79, 4737.05, 5350.28, 12112.11, 13153.72, 
6049.69, 5430.46, 4627.59, 3637.2, 5482.43, 16705.15, 12221.16, 
13198.88, 6484.54, 5590.86, 4979.09, 5771.75, 7311.92, 16111.86, 
8047.77, 11706.91, 6042.14, 5670.74, 6905.07, 11261.89, 9700.4, 
6643.03, 5693.85, 14778.67, 9128.14, 3682.01, 7911.5, 17742.85, 
5093.31, 7867.97, 3202.78, 2843.35, 2598.77, 10930.81, 11204.67, 
7289.62, 4000.17, 4178.89, 4507.33, 6671.48, 10317.48, 9368.98, 
6156.41, 8375.24, 2762.76, 2457.59, 4707.51, 4584.52, 3749.82, 
11667.82, 4271.67, 3614.3, 3715.83, 4510.57, 4872.36, 21805.71, 
4757.04, 6515.92, 2834.25, 2685.19, 3509.28, 4479.35, 17817.99, 
10357.67, 3412.15, 3044.95, 2840.24, 3348.91, 13671.68, 2027.42, 
1616.25, 1177.73, 995.25, 1062.25, 1578.07, 1649.8, 1410.06, 
1592.03, 3995.24, 6489.87, 6895.21, 8298.58, 7698.68, 5782.07, 
7671.08, 19539.4, 7023.84, 6509.9, 6643.28, 19850.3, 6856.67, 
13142.15, 5524.75, 5063.2, 4916.81, 6117.54, 6717.86, 9393.95, 
10462.44, 10511.15, 4497.94, 4038.31, 5503.91, 5554.82, 5801.11, 
12992.82, 4778.61, 4067.41, 4359.53, 6148.1, 9236.51, 5773.16, 
11313.13, 4702.37, 4167.3, 4067.75, 4469.11, 9278.41, 9911.18, 
5161.13, 4477.78, 4459.53, 4080.14, 5084.67, 7735.34, 10676.6, 
5507.86, 8286.12, 4332.23, 4737.52, 5952.09, 7134.44)), .Names = c("TKDate", 
"spend"), row.names = c(NA, 413L), class = "data.frame") 

The code:

library(forecast) 
explaining<-rep(1:length(data$TKDate)) 
predic<-rep((length(data$TKDate)+1):(length(data$TKDate)+31)) 
modArima <- auto.arima(data[,2],xreg=explaining) 
fit<-forecast(modArima,h=30,xreg=explaining,newdata=predic) 
plot(fit) 

I get this strange jump in the resulting plot.

Can anyone explain this strange jump to me? Why doesn't the forecast continue from the last observed data point (or at least close to it)?

Answer

4

This is a hard bug to find, I admit.

forecast.Arima() takes the new values of the external regressors not in a newdata parameter (as predict.lm() does), but in the xreg parameter. So instead of

fit <- forecast(modArima,h=30,xreg=explaining,newdata=predic) 

which forecasts using the values of explaining, not those of predic (unfortunately, forecast.Arima() does not throw a warning if you feed data to the nonexistent newdata parameter), do this:

fit <- forecast(modArima,h=30,xreg=predic) 

and plot the result (with the in-sample fits thrown in for good measure; EDIT: somewhat confusingly, the in-sample fits are not returned by auto.arima() or arima() the way they are by lm(), but by forecast.Arima()):

plot(fit) 
lines(fit$fitted,col="red") 
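For anyone who wants to try this without the original data, here is a minimal sketch on synthetic data. The series y, the names trend_in and trend_out, and the values of n and h are all made up for illustration; only auto.arima() and forecast() come from the forecast package. The point is that the regressor passed to forecast() should continue the trend past the end of the sample, so the regression part of the forecast carries on from where the data stop instead of restarting at the beginning of the trend.

```r
library(forecast)

set.seed(1)
n <- 200   # hypothetical series length
h <- 30    # hypothetical forecast horizon

# Synthetic series: downward linear trend plus noise (illustrative data only)
y <- ts(100 - 0.2 * seq_len(n) + rnorm(n, sd = 5))

# Trend regressor for the fitting period, and its continuation for the forecast period.
# One-column matrices with the same column name keep the two regressors consistent.
trend_in  <- cbind(trend = as.numeric(seq_len(n)))
trend_out <- cbind(trend = as.numeric(n + seq_len(h)))

fit_model <- auto.arima(y, xreg = trend_in)

# Future regressor values go in xreg; there is no newdata argument
fc <- forecast(fit_model, xreg = trend_out)

length(fc$mean)
```

As far as I can tell from the package documentation, h is ignored when xreg is supplied and the horizon is taken from the number of rows of xreg, which is why h is left out above. By the same logic, predic in the question has 31 values, so I would expect 31 forecast periods rather than 30. Calling plot(fc) afterwards shows the forecast.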

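As to why no warning appears: my understanding is that forecast.Arima() has a ... argument, and in R a function with ... silently accepts any extra named argument. The sketch below uses a hypothetical stand-in, toy_forecast(), which is not the real function and only mimics that signature shape.

```r
# Hypothetical stand-in with the same shape of signature; NOT the real forecast.Arima()
toy_forecast <- function(object, h = 10, xreg = NULL, ...) {
  # Anything passed under a name that is not a formal argument (such as newdata)
  # ends up in ... and is never looked at
  list(used_xreg = xreg)
}

in_sample <- 1:100    # regressor values for the fitting period
future    <- 101:130  # regressor values for the forecast period

# Mistaken call: newdata is swallowed by ..., so the in-sample values are used
wrong <- toy_forecast(NULL, xreg = in_sample, newdata = future)

# Intended call: the future values are passed as xreg
right <- toy_forecast(NULL, xreg = future)

identical(wrong$used_xreg, in_sample)  # TRUE
identical(right$used_xreg, future)     # TRUE
```

The mistaken call runs without error or warning, which matches what happens in the question.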

+2

I must say that putting the new data in 'xreg' and not in 'newdata' is quite confusing, especially if you're used to using 'predict'. Thanks again!