Tom's Guide Forums
 Thread : [H] defending real gameplay vs benchmarks
 
More Information

Many of you may know that Kyle and Brent over at [H] have a different way of testing video cards compared to most review sites. Also, they are quite vocal about defending their methods. Well it seems Crysis and the 3870x2 launch have caused them to explore this further to back up their views and reviews. Have a look:

http://enthusiast.hardocp.com/arti [...] VzaWFzdA==


---------------
MSI P6N SLI Platinum, Q6600, 2GB Crucial Ballistix Tracer PC8000,
SLI BFG 8800GT OC 512MB, SB X-Fi Fatality, Antec TruePower Trio 550W, Windows XP pro

More Information

I understand their methods, and at the same time I don't understand them. How can I, when I can't reproduce them myself? Throughout any given game, certain demands will be applied to varying degrees, which will favor one card over another, so this in itself ruins their idea. My edit is to clarify: any game will demand different things from the GPU at different points. At one point, a card that is good at shaders will do well; at a different point in the same game, where shaders aren't in such demand, it may not. Hope this clears it up somewhat.


Message edited by jaydeejohn on 02-11-2008 at 01:04:28 PM

---------------
Every artist is a cannibal,every poet is a thief,they all kill their inspiration then sing about their grief
More Information

One final note. They ([H]) have gone to great lengths to defend their testing, flawed though it is. If they persist in defending said methods, that's OK. Just maybe next time use something other than a beta driver, or take the lows (like the 0.7 fps) out of the article; that makes the card look bad, as well as the methods used. Oh, and use the LATEST drivers, though they're beta as well.


---------------
Every artist is a cannibal,every poet is a thief,they all kill their inspiration then sing about their grief
More Information

I am glad they do things differently. I mean, it's just another tool in the consumers' tool box, and I am sure that difference is what keeps the site hits coming.

More Information

they have a good point.


---------------
Q6600 @ 3Ghz | zalman 9700NT cooler | gigabyte P35-DS3L | Kingstone DDR2 667 1GB x 2 | HIS 4850HD with Accelero S1 Rev.2 | enermax Liberty 500w | Coolermaster C5 case |
More Information

One thing: if we start going in this direction, then it comes down to how each person plays as well. Does he run through? Scope out areas? Avoid certain conflicts? "To each his own" was never a better fit. And wouldn't each one reach a different conclusion? I like the same demands and the same play; then I'll know what is what. If one uses metric, another uses miles instead of kilometers, and yet another uses something different, how can we measure a thing? And how can I reproduce these things? Should I just trust them to play as I play? To go where I go in game? There are too many holes in this, with no tools to measure with and no way of reproducing it, which leaves me with no comparisons. There has to be a better way than what they're doing. I also wonder why they didn't use the newest drivers from ATI for the 3870X2. Was it because it was way too demanding to start over? They had them, and only used them in their Crysis benches. Shoddy, incomplete, and not reproducible. I'm glad we don't use that as a standard regarding any scientific research.


---------------
Every artist is a cannibal,every poet is a thief,they all kill their inspiration then sing about their grief
More Information

^good point.

They said they play the game, but they don't follow a single standard. If they pick a scene that needs lots of rendering, that choice is very subjective and differs from game to game.

Following a benched "canned" demo helps to set a standard, which in turn reflects what the compared GPU can handle.

I think their problem is in selecting the time frame over which they study the frame rates, since that is very subjective.


---------------
Q6600 @ 3Ghz | zalman 9700NT cooler | gigabyte P35-DS3L | Kingstone DDR2 667 1GB x 2 | HIS 4850HD with Accelero S1 Rev.2 | enermax Liberty 500w | Coolermaster C5 case |
More Information

their whole point about the optimizations with canned benchmarks is kind of moot... considering you can turn off the optimizations very easily


Message edited by skittle on 02-11-2008 at 03:34:08 PM

---------------
macgirlfriend:
"Hey I don't get you people, the people on insanely mac were so much nicer"
More Information

I like the way HardOCP benchmarks video cards. And, I would like to know how a card really performs while playing a game. If it is going to get into the unplayable range, I would like to know this. Keep it up IMO.

hacking your computer
More Information

It's HardOCP's opinion that determines what they deem playable. No one else's. It is nothing but their opinion.

 

This kind of test does show something, but not as much as running apples-to-apples tests.


Message edited by marvelous211 on 02-11-2008 at 06:25:55 PM

---------------
Asus P5B vanilla with E6300 B2 stepping @ 3.5
3 gigs Micron D9
EVGA 8800gs 729/1728/1044
http://www.tomshardware.com/forum/ [...] ew-benches
More Information

I think I will be looking back in there now and again for reviews. It is a different twist. Granted, it is not 'scientific', and their results will most likely change every time they test a card/game, but it is more information than the same ol'.

If I were buying an expensive card, having the extra info might be something to consider, especially if I were buying specifically for a game like Crysis… Great, the card owns in 3DMark, but how does it do in the only app I care about?

At the end of the day, their [H] results would have to be taken as subjective.


---------------
Striker Extreme | Q6600 @ 3 GHZ| 8800GTX | 4GIG DDR2-800 | 1000W PSU | Raptor 150GB | 2*Western Digital 300GB | Water cooled.
More Information

I honestly despise HardOCP's gaming benchmarks. It's a perfect example of how not to benchmark. Why compare cards running at different settings with all this "max playable" and "lowest playable"? It takes what, 5-15 minutes to figure out how well your system can play a game? So you fine-tune it yourself based on your PC's performance. But running a benchmark that is supposed to compare apples to apples, and instead comparing apples to oranges? It's pointless. The point is to compare different hardware so we get a better understanding, not to force us to compensate for the different settings. We all know every architecture has strengths and flaws, some being better at AA than others (ahem, r6xx). This is honestly a flawed review system that should be dropped. But we know that ain't gonna happen ^_^.

My ass does all my talking!
More Information

I really don't understand why people oppose HardOCP's gpu benchmarking methodology. C'mon even the EPA recognized that using the estimated miles per gallon formula was misleading and updated their mileage estimates to include real-world driving tests.

If the friggin government can recognize the value of performing real-world tests, what's stopping a bunch of hardware geeks from doing the same? Given that it was proven nVidia and ATI tweaked drivers to perform better in benchmarks than in real-world use, why wouldn't hardware geeks want a real-life comparison?

At the very least, HardOCP's methodology offers an expert opinion on the quality of gameplay an everyday gamer can expect. Kudos to HardOCP!

Message edited by chunkymonster on 02-11-2008 at 07:57:21 PM

---------------
Candy asked me if she died if I could go on
Of course I said I couldn't and of course we knew that's wrong
But Candy I said Candy no you can't do that to me
Because you love me way to much for you to ever leave
hacking your computer
More Information

What makes you think HardOCP's opinions are the same as yours? They aren't. What they deem playable might be playable for some and not for others. It might even be overkill for some.

Hardocp should put up regular canned benches as well as what they think.


---------------
Asus P5B vanilla with E6300 B2 stepping @ 3.5
3 gigs Micron D9
EVGA 8800gs 729/1728/1044
http://www.tomshardware.com/forum/ [...] ew-benches
More Information

It would be nice if they did the following:

Crysis
1) The top 10 video cards on the same Intel machine running exact same settings -->maxed everything
2) The top 10 video cards on the same AMD machine running exact same settings -->maxed everything

Rinse, repeat with the top 100 games :)


---------------
Striker Extreme | Q6600 @ 3 GHZ| 8800GTX | 4GIG DDR2-800 | 1000W PSU | Raptor 150GB | 2*Western Digital 300GB | Water cooled.
More Information

chunkymonster: You really can't compare what you just stated, it's apples and oranges all over again! =P.

When it comes to benchmarks, when identical settings/setups are used, you can actually tell the difference between the cards. This isn't a skewed MPG system like the one used for cars; we have actual proof on hand. HardOCP's methodology for benchmarking GPUs is doing exactly what you claimed they are trying not to do: skewing the benchmarks and making it harder for the consumer/public to compare products.

VERY bad benchmarking.

More Information

First of all, I like this idea of evaluating performance for particular games.
I think most everyone is aware that drivers/hardware are often optimized to take advantage of benchmarks. Although I wish they would just do level run-throughs at high/med/low res/settings and do away with the "playable" thing. The other thing is that the reader is forced to assume there is no company bias, e.g. the tester runs through the level looking at the ground as much as possible on one card, and stares directly at explosions/effects throughout the level on the other. Now, if they want to get extensive: test two cards, running through each level 3-4 times and providing min/max/avg for each run-through for each card, then an overall for each level for each card, then an overall for the entire game. Done by unbiased testers, of course. I think that would paint a pretty accurate picture. For those of you defending synthetics: they can be easily manipulated. It's the equivalent of me getting a copy of an upcoming exam beforehand and scoring well on the test. My results on said test do not reflect my real-world knowledge of the topic. Had the exam changed prior to my taking it, a much more realistic portrayal of my knowledge would have been recorded.

I mean, it's up to you. If card A scored better on a synthetic benchmark than card B, and card B performed better in rigorous real-world application than card A, I guess it is up to you which you would like better: good on paper or good in application.

I would prefer repeated real-world testing (i.e. running the same level multiple times on each card and recording the values for each run).
I mean, come on, how can you argue against real-world testing? If I run a game's benchmark on a card and get LFPS 30, HFPS 80 and AVG 50, and then when I actually play the game I get LFPS 15, HFPS 50 and AVG 30, that's a little misleading, no? Especially if a card that benchmarked lower actually does better in the real world.

Not even the benchmarks are 100 percent accurate every time. Run a benchmark 3 times in a row and tell me if you get the same score each time.

Barring somebody purposely influencing the real-world test (i.e. exploiting high-frame-rate situations by looking at the ground/sky etc. for an extended period of time), I think you are going to get pretty accurate information.

Also, in my opinion, why isn't this scientific? Say you are at a pool table and hit a cue ball into the 8 ball 3 times, each time from the same distance, with the same force, etc., and record the direction, speed, and distance at which the 8 ball stops after being hit. Then you use a computer physics simulation to do the same thing. How is the aforementioned not scientific? It's the same situation as comparing real-world game performance to a synthetic benchmark. Of course the real-world application won't be perfect every time, but the REAL WORLD ISN'T PERFECT, and what works in a simulation may prove not as good in actual application. The key is repetition, and using averages. If the reviewers played through a level 10 times and handed you an average low, average high, and general average over all ten instances, would you still not trust it over the synthetic?!
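To make the "average low, average high, general average" idea concrete, here is a rough Python sketch. The function name `summarize_runs` and the FPS numbers are made up for illustration; they are not from [H] or any real test, and a real run would have far more samples per playthrough.

```python
from statistics import mean

def summarize_runs(runs):
    """Summarize repeated playthroughs of the same level.

    `runs` is a list of playthroughs; each playthrough is a list of
    FPS samples (hypothetical data, e.g. one sample per second).
    """
    lows = [min(r) for r in runs]    # lowest FPS seen in each run
    highs = [max(r) for r in runs]   # highest FPS seen in each run
    avgs = [mean(r) for r in runs]   # average FPS of each run
    return {
        "avg_low": mean(lows),
        "avg_high": mean(highs),
        "avg": mean(avgs),
    }

# Three made-up playthroughs of one level on one card.
runs = [
    [15, 30, 50],
    [17, 32, 47],
    [13, 28, 53],
]
summary = summarize_runs(runs)
print(summary)
```

Reporting the per-run lows and highs alongside these averages would let readers see how much the playthroughs actually differ.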

Sorry for the long rant, I just cannot understand why anyone would prefer a synthetic benchmark to rigorous real world testing.

Again, I am not a big fan of the way they do it, but I would prefer real world benchmarks to synthetics.

Message edited by tsd16 on 02-11-2008 at 09:25:47 PM
More Information

tsd16 wrote :

Also, in my opinion, why isn't this scientific? Say you are at a pool table and hit a cue ball into the 8 ball 3 times, each time from the same distance, with the same force, etc., and record the direction, speed, and distance at which the 8 ball stops after being hit. Then you use a computer physics simulation to do the same thing. How is the aforementioned not scientific?



It is not scientific because they can't reproduce the same result three times in a row. Granted, they will get close, but they will not have scientific answers.
I for one would be happy with the average of three, as long as we can see all three results, and they are close.

**edit**
There are too many variables in this type of testing to be scientific
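As a rough sketch of what "average of three, as long as they are close" could mean in practice, something like the Python below would do. The name `runs_are_close` and the 5% tolerance are my own assumptions, not anything [H] uses, and the FPS values are invented.

```python
from statistics import mean

def runs_are_close(avg_fps_per_run, tolerance=0.05):
    """Return True if every run is within `tolerance` (as a fraction)
    of the mean of all runs. The 5% default is an arbitrary assumption."""
    center = mean(avg_fps_per_run)
    return all(abs(x - center) <= tolerance * center for x in avg_fps_per_run)

# Made-up average FPS from three playthroughs of the same level.
three_runs = [31.0, 32.0, 30.5]
print(three_runs, mean(three_runs), runs_are_close(three_runs))

# A set of runs that differ too much to be summarized by one average.
scattered_runs = [31.0, 40.0, 25.0]
print(scattered_runs, mean(scattered_runs), runs_are_close(scattered_runs))
```

Printing all three results next to the average keeps the spread visible, which is the part that matters for trusting the number.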


Message edited by grieve on 02-11-2008 at 10:00:55 PM

---------------
Striker Extreme | Q6600 @ 3 GHZ| 8800GTX | 4GIG DDR2-800 | 1000W PSU | Raptor 150GB | 2*Western Digital 300GB | Water cooled.