The concentrations of sulfate, black carbon (BC) and other aerosols in the Arctic are characterized by high values in late winter and spring (so-called Arctic Haze) and low values in summer. Models have long been struggling to capture this seasonality and especially the high concentrations associated with Arctic Haze. In this study, we evaluate sulfate and BC concentrations from eleven different models driven with the same emission inventory against a comprehensive pan-Arctic measurement data set over a time period of 2 years (2008–2009). The set of models consisted of one Lagrangian particle dispersion model, four chemistry transport models (CTMs), one atmospheric chemistry-weather forecast model and five chemistry climate models (CCMs), of which two were nudged to meteorological analyses and three were running freely. The measurement data set consisted of surface measurements of equivalent BC (eBC) from five stations (Alert, Barrow, Pallas, Tiksi and Zeppelin), elemental carbon (EC) from Station Nord and Alert and aircraft measurements of refractory BC (rBC) from six different campaigns. We find that the models generally captured the measured eBC or rBC and sulfate concentrations quite well, compared to previous comparisons. However, the aerosol seasonality at the surface is still too weak in most models. Concentrations of eBC and sulfate averaged over three surface sites are underestimated in winter/spring in all but one model (model means for January–March underestimated by 59 and 37% for BC and sulfate, respectively), whereas concentrations in summer are overestimated in the model mean (by 88 and 44% for July–September), but with overestimates as well as underestimates present in individual models. The most pronounced eBC underestimates, not included in the above multi-site average, are found for the station Tiksi in Siberia where the measured annual mean eBC concentration is 3 times higher than the average annual mean for all other stations. This suggests an underestimate of BC sources in Russia in the emission inventory used. Based on the campaign data, biomass burning was identified as another cause of the modeling problems. For sulfate, very large differences were found in the model ensemble, with an apparent anticorrelation between modeled surface concentrations and total atmospheric columns. There is a strong correlation between observed sulfate and eBC concentrations with consistent sulfate/eBC slopes found for all Arctic stations, indicating that the sources contributing to sulfate and BC are similar throughout the Arctic and that the aerosols are internally mixed and undergo similar removal. However, only three models reproduced this finding, whereas sulfate and BC are weakly correlated in the other models. Overall, no class of models (e.g., CTMs, CCMs) performed better than the others and differences are independent of model resolution.