Metacritic in 2014

Only three original console games averaged 90% or more on Metacritic in 2014

The highest-scoring original game was Wii U title Super Smash Bros with 92 per cent, but even that is down on previous winners such as Grand Theft Auto V (97%, 2013), The Walking Dead (95%, 2012), Batman: Arkham City (96%, 2011) and Super Mario Galaxy 2 (97%, 2010).

PS4 had the best-rated games on average (72%), with five titles at 90% or more and a further four at 75% or more. Xbox One was second with an average score of 72.4 per cent, although it had only one 90%+ title and six 75%+ games. Wii U came third with a 66.8 per cent average, followed by PC (70.4%), although the latter had a whopping 45 games scoring 75 per cent or more.

Forza Horizon 2 (86%) was Xbox One’s highest rated exclusive title.

The titles in Metacritic’s 2014 90 per cent club are:
1. Grand Theft Auto V (Xbox One) – 97%
2. Grand Theft Auto V (PS4) – 96%
3. The Last of Us: Remastered (PS4) – 95%
4. Super Smash Bros (Wii U) – 92%
5. Dark Souls II (360) – 91%
6. Bayonetta 2 (Wii U) – 91%
7. Dark Souls II (PC) – 91%
8. Dark Souls II (PS3) – 91%
9. Diablo III: Ultimate Evil Edition (PS4) – 90%
10. Rayman Legends (PS4) – 90%
11. Velocity 2X (Vita) – 90%
12. Fez (PS4) – 90%
13. Shovel Knight (3DS) – 90%
14. Guacamelee! Super Turbo Championship Edition (Wii U) – 90%

The year’s biggest critical flops included the Activision trio of The Amazing Spider-Man 2 (46%, Xbox One), Transformers: Rise of the Dark Spark (43%, PS4) and The Legend of Korra (49%, Xbox One), as well as Reef’s Rambo: The Video Game (28%, 360), Deep Silver’s Escape Dead Island (35%, 360) and Risen 3: Titan Lords (36%, PS3), Sega’s Sonic Boom: Rise of Lyric (32%, Wii U) and Sonic Boom: Shattered Crystal (46%, 3DS), System 3’s Putty Squad (38%, PS4) and Bandai Namco’s Tenkai Knights: Brave Battle (26%, 3DS).

Thoughts?

FYI: If I'm not mistaken, PS4 GTA V has been holding onto the number one spot in the UK since its release.
 
<insert stuff about metacritic being evil and whatnot>
Anyway, I find it useful, but the metric you used (% of games above a threshold) is rather useless to me; I'm more interested in the number of games above a threshold and comparing those between platforms...
(Because I obviously buy a number of games not a % of released games ^^)
I use it as a guideline for further investigation, a higher % meaning a better chance that I'll like it too. I find the user score very interesting when compared to the press score. (And the press score is useless without the user score, in fact.)
[I use a weighted average and convert to a linear scale to get something more meaningful to me.]
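Roughly along these lines, if anyone's curious - the 60/40 press/user split and the 60-95 "useful range" below are just my own illustrative numbers, not anything Metacritic publishes:

```python
# A rough sketch of the kind of thing I mean. The 60/40 press/user split and
# the 60-95 "useful range" are my own illustrative numbers, not Metacritic's.
def my_score(press, user):
    """Combine a press score (0-100) and a user score (0-10) with fixed weights,
    then stretch the band where most games land onto a 0-100 linear scale."""
    combined = 0.6 * press + 0.4 * (user * 10)
    low, high = 60.0, 95.0          # most games land in roughly this band
    linear = (combined - low) / (high - low) * 100
    return max(0.0, min(100.0, linear))

print(my_score(press=91, user=8.4))  # hypothetical inputs -> ~80.6
```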
 
I'd say Metacritic scores in general tell the correct tale. It's not been the best year in the games biz, but rather a typical new-gen year; 2006 also had a low number of 90+ reviews. The death of mid-range games and the huge number of indie games also mix things up in this generation. The gap between AAA and indie is huge. But if you happen to tolerate the low production values of most indie games, it just might be that this year was the best year in games... ever. I can't, so it was rather average for me.

I don't even have a best game for this year. I've played plenty of good games, but nothing great. I expect that to change next year if GTA V is finally released for the PC.
 
Consensus is a blessing and a curse.

On the one hand, consensus allows people to avoid software turds, but on the other hand it allows marketing departments to manipulate opinion into giving a generic-as-fuck game 93%.

Metacritic is the equivalent of a large group of semi-independent nubbins being manipulated into rubbing their tiny little acorns together to come up with a commercially and publicly acceptable score, with the knowledge that said [PR-manipulated] score will affect sales and public evaluation of their worth as whores.

In short: fuck Metacritic; long live Metacritic. If devs can't force high ratings for big-budget games, then how are we going to continue to get them?
 
Does a high Metacritic score translate into high sales?
I'm looking at Fez in that list, so no, I don't believe it does.

It does translate directly into how much the developers get paid, though.



I think Metacritic should have some decent competition and become less important.
No one knows how (in)corruptible they can be, and their lack of transparency as to how they calculate their reviewer score is a big problem to me.
 
I think Metacritic should have some decent competition and become less important.
No one knows how (in)corruptible they can be, and their lack of transparency as to how they calculate their reviewer score is a big problem to me.
I thought it was just a mean average of all review scores they found.

It's certainly not clever, because they have exactly the same game scoring different scores on different systems. A 1% difference on GTA V, for example, is just statistical noise where platform-specific coverage is shifting the result for one machine. If the score were truly representative, shouldn't PS4 score a smidgen higher for its typically higher performance?

Perhaps if they banded titles into 5%/10% bands, it'd be more representative. But then you'd have, like exams, a stupid striation. Missed "Excellent" by one percent due to one reviewer having a bad week when they came to give a subjective score to your title? Then down to "Good" and no bonuses for anyone.
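To make that striation concrete, here's a toy version of the banding idea (the band width and labels are made up for illustration):

```python
# Toy banding: a one-point gap straddling a band edge changes the label,
# while a nine-point gap inside a band doesn't. Band width is made up.
def band(score, width=10):
    lower = (score // width) * width
    return f"{lower}-{lower + width - 1}"

print(band(89), band(90))  # '80-89' '90-99' -- one point apart, different bands
print(band(90), band(99))  # '90-99' '90-99' -- nine points apart, same band
```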

So perhaps it's how the data is used that really matters? Like all stats, the numbers only have a limited value, derived from how they are applied and interpreted. If dealt with sensibly, they should be representative of consensus, while users should realise that a 90% doesn't mean they'll like that game, and a 70% doesn't mean they'll dislike that game. Sadly, human beings tend to prefer basing their decisions on numbers rather than less obvious criteria (save maybe brand - Little Britain was crap but gained the UK number one slot), and offering them a score is, by and large, going to shift their thinking process.
 
I thought it was just a mean average of all review scores they found.

Nope. Different reviewers get different weights, and Metacritic won't disclose the weight of each reviewer.
Which is why Metacritic is good for universally well-made (or universally bad) games. Sure, if everyone agrees it's a good game, then it'll get a >85% on Metacritic. If everyone agrees the game is bad, then it'll get a <65%.
However, it's absolute crap for evaluating games that fall in the 65-85% area... which is where most of them end up.
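To see how much undisclosed weights can move a score without any individual review changing, here's a quick sketch (the review scores and weights below are invented; whatever Metacritic actually uses isn't public):

```python
# Same five (invented) review scores, with and without (invented) weights.
# Neither scheme is Metacritic's -- that's the point: you can't reconstruct
# the weights from the published Metascore.
scores  = [95, 90, 85, 70, 60]
plain   = sum(scores) / len(scores)

weights  = [3, 3, 2, 1, 1]   # hypothetical "favoured outlets count triple"
weighted = sum(w * s for w, s in zip(weights, scores)) / sum(weights)

print(round(plain, 1), round(weighted, 1))  # 80.0 85.5
```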
 
That is bad then. Metacritic will be skewing the data and it's thus non-scientific.
Not necessarily. It may actually be correcting for bias. For example, a game review from Maxim is nowhere near as good as a review in Eurogamer, so some outlets should carry less weight so as not to drown out reviews written by people who fit the job better.
 
It's still non-scientific because the data isn't pure. Scientific data should provide the raw data and the filtering method used, for checking and independent analysis. As it is, we just have to trust that Metacritic is filtering out bias rather than filtering it in. And ultimately, reviews are the views of individuals, so no review is inherently 'better' than any other. In fact, it can be argued that a gaming site is more dependent on appeasing publishers than a truly independent review site, and so is more likely to be biased in its reviews (more favourable towards the publishers that fund its advertising) and in need of having its scores filtered lower.
 
It's still non-scientific because the data isn't pure. Scientific data should provide the raw data and the filtering method used, for checking and independent analysis. As it is, we just have to trust that Metacritic is filtering out bias rather than filtering it in. And ultimately, reviews are the views of individuals, so no review is inherently 'better' than any other. In fact, it can be argued that a gaming site is more dependent on appeasing publishers than a truly independent review site, and so is more likely to be biased in its reviews (more favourable towards the publishers that fund its advertising) and in need of having its scores filtered lower.
What do you mean by "filtering out rather than filtering in"?

Of course there are better reviews than others. Some reviewers might spend hours on a proper playthrough and try to put aside their personal preferences before they write a review, while others don't bother as much and give a game a low score just because it's "hard". I have read reviews that made me think the reviewer didn't understand the game, or that he wasn't even a gamer, just some guy hired to do a fill-in job so the magazine or site would get some extra audience.
What you describe in the bolded text does, of course, happen, and I agree that it's a different form of bias which, unfortunately, we can't be sure there are any weights to eliminate. Most probably not.
At this point we don't know how the weights work. Do they put lower weights on platform-specific magazines and sites, for example? Do they leave independent but credible sites untouched while reducing the weights on others?
 
It's still non-scientific because the data isn't pure. Scientific data should provide the raw data and the filtering method used, for checking and independent analysis.
Correcting for irregularities or bias in raw data is standard practice when reporting findings; the only thing amiss here is the lack of transparency about how they're doing this.
 
What do you mean by "filtering out rather than filtering in"?
If they weight biased reviews more readily, they'll be adding bias to their results. If the favoured reviewers are on the take, the positive scores, bought and paid for, would push the Metacritic score up while the independent, more realistic (or at least untainted) views would be devalued.

What you describe of course in the bolded text does happen and I agree that is a different form of bias for which unfortunately we are unsure if there are any weights to eliminate it. Most probably not.
Right. So the solution is to either limit oneself to a few trusted sources, where every game is scored by the same set of reviewers, each applying their own consistent criteria, or just use every number available. A large enough dataset will be representative of the wider buying public. A game might score 90% from three gaming sites representing the views of core gamers, but 50% from 30 lifestyle sites representing the views of more casual gamers. A Metacritic score of 90%, limited to the gaming press, wouldn't represent the score that Average Joe might give the game.

Is Metacritic supposed to be getting the gamer-focussed score, or the general populace score? Same with movies - should the score be based on movie enthusiasts or the general movie-going public?

At this point we dont know how the weights work.
Which is why we can't trust the score. ;)

Correcting for irregularities or bias in raw data is standard practice when reporting findings; the only thing amiss here is the lack of transparency about how they're doing this.
Yeah, that's basically what I meant to say. We don't always get access to the raw data, although in the case of an automated web-aggregating system, there's no reason not to. And indeed, Metacritic lists the scores and I presume all their data points.
 
I'd be fascinated to see their weightings and the reasons for them. I (semi-)regularly read C&VG (RIP), EDGE, Eurogamer, GameInformer, IGN and VideoGamer, and it's clear as day that EDGE and Eurogamer are far more discriminating with their 8, 9 and 10 scores compared to the other sites.
 
Reviewers (individuals) or review sites?
I don't know. They won't say.
All we know is that the score isn't a mean average (you can try that for yourself on any game's score).
Metacritic has acknowledged as much, but they won't disclose how they calculate it.



Correcting for irregularities or bias in raw data is standard practice when reporting findings; the only thing amiss here is the lack of transparency about how they're doing this.
Exactly.
For a website that has become so powerful and so decisive for the success or failure of dev teams and publishers in an $80 billion industry, I find this really troubling.



What do you mean by "filtering out rather than filtering in"?

Of course there are better reviews than others. Some reviewers might spend hours on a proper playthrough and try to put aside their personal preferences before they write a review, while others don't bother as much and give a game a low score just because it's "hard". I have read reviews that made me think the reviewer didn't understand the game, or that he wasn't even a gamer, just some guy hired to do a fill-in job so the magazine or site would get some extra audience.
What you describe in the bolded text does, of course, happen, and I agree that it's a different form of bias which, unfortunately, we can't be sure there are any weights to eliminate. Most probably not.
At this point we don't know how the weights work. Do they put lower weights on platform-specific magazines and sites, for example? Do they leave independent but credible sites untouched while reducing the weights on others?

Yes, scores should be weighted. But who's weighting them? Who is evaluating the quality of each reviewer/site and what criteria are they using?
And why isn't it disclosed?

For all we know, they could turn a mediocre game into an 85% without anyone ever figuring out how or why that happened.
 
Not necessarily. It may actually be correcting for bias. For example, a game review from Maxim is nowhere near as good as a review in Eurogamer, so some outlets should carry less weight so as not to drown out reviews written by people who fit the job better.

Sorry, but a Eurogamer review is just as purely subjective as a Maxim review, and often paired with ignorance. This is true of all so-called professional game reviewers.

Didn't know that Metacrit is so bad and weighs reviews differently...
 
Can you imagine a world without critics?!
 
Can you imagine a world without critics?!
Without demos for everything, it'd be horrific. Who'd protect you from making lousy purchases of broken or rubbish software? Either we should have demos large enough to be perfectly representative, or we need the freedom for people to share their opinions and help others make an informed choice. Nothing wrong with that - it's just the monolithic score that's the issue.
 