Software Development Blogs: Programming, Software Testing, Agile, Project Management


Coding Horror - Jeff Atwood
programming and human factors

To ECC or Not To ECC

Fri, 11/20/2015 - 00:44

On one of my visits to the Computer History Museum – and by the way this is an absolute must-visit place if you are ever in the San Francisco bay area – I saw an early Google server rack circa 1999 in the exhibits.

Not too fancy, right? Maybe even … a little janky? This is building a computer the Google way:

Instead of buying whatever pre-built rack-mount servers Dell, Compaq, and IBM were selling at the time, Google opted to hand-build their server infrastructure themselves. The sagging motherboards and hard drives are literally propped in place on handmade plywood platforms. The power switches are crudely mounted in front, the network cables draped along each side. The poorly routed power connectors snake their way back to generic PC power supplies in the rear.

Some people might look at these early Google servers and see an amateurish fire hazard. Not me. I see a prescient understanding of how inexpensive commodity hardware would shape today's internet. I felt right at home when I saw this server; it's exactly what I would have done in the same circumstances. This rack is a perfect example of the commodity x86 market D.I.Y. ethic at work: if you want it done right, and done inexpensively, you build it yourself.

This rack is now immortalized in the National Museum of American History. Urs Hölzle posted lots more juicy behind the scenes details, including the exact specifications:

  • Supermicro P6SMB motherboard
  • 256MB PC100 memory
  • Pentium II 400 CPU
  • IBM Deskstar 22GB hard drives (×2)
  • Intel 10/100 network card

When I left Stack Exchange (sorry, Stack Overflow) one of the things that excited me most was embarking on a new project using 100% open source tools. That project is, of course, Discourse.

Inspired by Google and their use of cheap, commodity x86 hardware to scale on top of the open source Linux OS, I also built our own servers. When I get stressed out, when I feel the world weighing heavy on my shoulders and I don't know where to turn … I build servers. It's therapeutic.

I like to give servers a little pep talk while I build them. "Who's the best server! Who's the fastest server!"

— Jeff Atwood (@codinghorror) November 16, 2015

Don't judge me, man.

But more seriously, with the release of Intel's latest Skylake architecture, it's finally time to upgrade our 2013 era Discourse servers to the latest and greatest, something reflective of 2016 – which means building even more servers.

Discourse runs on a Ruby stack and one thing we learned early on is that Ruby demands exceptional single threaded performance, aka, a CPU running as fast as possible. Throwing umptazillion CPU cores at Ruby doesn't buy you a whole lot other than being able to handle more requests at the same time. Which is nice, but doesn't get you speed per se. Someone made a helpful technical video to illustrate exactly how this all works:

This is by no means exclusive to Ruby; other languages like JavaScript and Python also share this trait. And Discourse itself is a JavaScript application delivered through the browser, which exercises the mobile / laptop / desktop client CPU. Mobile devices reaching near-parity with desktop performance in single threaded performance is something we're betting on in a big way with Discourse.

So, good news! Although PC performance has been incremental at best in the last 5 years, between Haswell and Skylake, Intel managed to deliver a respectable per-thread performance bump. Since we are upgrading our servers from Ivy Bridge (very similar to the i7-3770k), the generation before Haswell, I'd expect a solid 33% performance improvement at minimum.

Even worse, the more cores they pack on a single chip, the slower they all go. From Intel's current Xeon E5 lineup:

  • E5-1680 → 8 cores, 3.2 GHz
  • E5-1650 → 6 cores, 3.5 GHz
  • E5-1630 → 4 cores, 3.7 GHz
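
To put rough numbers on that tradeoff, here is a minimal sketch using the three base clocks listed above. It assumes, as a deliberate simplification, that a single-threaded request takes time inversely proportional to clock speed and that cores × clock is a fair proxy for total throughput. The function names are mine, purely for illustration.

```python
# Base clocks from the Xeon E5 list above: (model, cores, GHz).
XEONS = [
    ("E5-1680", 8, 3.2),
    ("E5-1650", 6, 3.5),
    ("E5-1630", 4, 3.7),
]


def aggregate_ghz(cores, ghz):
    """Crude throughput proxy: total clock across all cores."""
    return cores * ghz


def per_request_slowdown(ghz, fastest_ghz):
    """Relative time for one single-threaded request versus the fastest
    chip, assuming (simplistically) that time scales as 1 / clock."""
    return fastest_ghz / ghz


fastest = max(ghz for _, _, ghz in XEONS)
for model, cores, ghz in XEONS:
    print(
        f"{model}: {aggregate_ghz(cores, ghz):.1f} GHz in total, "
        f"{per_request_slowdown(ghz, fastest):.2f}x time per request"
    )
```

Under those assumptions the 8-core part handles the most simultaneous requests, but each individual request takes about 16% longer than on the 4-core part. That is the per-thread penalty the build below is trying to avoid.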

Sad, isn't it? Which brings me to the following build for our core web tiers, which optimizes for "lots of inexpensive, fast boxes"

2013 build:

  • Xeon E3-1280 V2 Ivy Bridge 3.6 GHz / 4.0 GHz quad-core ($640)
  • SuperMicro X9SCM-F-O mobo ($190)
  • 32 GB DDR3-1600 ECC ($292)
  • SC111LT-330CB 1U chassis ($200)
  • Samsung 830 512GB SSD ×2 ($1080)
  • 1U Heatsink ($25)
  • Total: $2,427
  • Power: 31w idle, 87w BurnP6 load

2016 build:

  • i7-6700k Skylake 4.0 GHz / 4.2 GHz quad-core ($370)
  • SuperMicro X11SSZ-QF-O mobo ($230)
  • 64 GB DDR4-2133 ($520)
  • CSE-111LT-330CB 1U chassis ($215)
  • Samsung 850 Pro 1TB SSD ×2 ($886)
  • 1U Heatsink ($20)
  • Total: $2,241
  • Power: 14w idle, 81w BurnP6 load

So, about 8% cheaper than what we spent in 2013, with 2× the memory, 2× the storage (probably 50-100% faster too), and at least ~33% faster CPU. With lower power draw, to boot! Pretty good. Pretty, pretty, pretty, pretty good.
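
If you want to check the totals yourself, the arithmetic is easy to reproduce. This is only a sketch: the prices are the ones quoted in the parts list above, and the dictionary keys are shorthand labels of my own.

```python
# Per-part prices in USD, copied from the two builds above.
BUILD_2013 = {
    "cpu": 640, "mobo": 190, "ram": 292,
    "chassis": 200, "ssd_pair": 1080, "heatsink": 25,
}
BUILD_2016 = {
    "cpu": 370, "mobo": 230, "ram": 520,
    "chassis": 215, "ssd_pair": 886, "heatsink": 20,
}

total_2013 = sum(BUILD_2013.values())
total_2016 = sum(BUILD_2016.values())
savings_pct = 100 * (total_2013 - total_2016) / total_2013

print(f"2013: ${total_2013:,}  2016: ${total_2016:,}")
print(f"2016 build is {savings_pct:.1f}% cheaper")
```

The sums match the quoted totals, and the saving works out to a little under 8%.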

(Note that the memory bump is only possible thanks to Intel finally relaxing their iron fist of maximum allowed RAM at the low end; that's new to the Skylake generation.)

One thing is conspicuously missing in our 2016 build: Xeons, and ECC RAM. In my defense, this isn't intentional – we wanted the fastest per-thread performance and no Intel Xeon, either currently available or announced, goes to 4.0 GHz with Skylake. Paying half the price for a CPU with better per-thread performance than any Xeon, well, I'm not going to kid you, that's kind of a nice perk too.

So what is ECC all about?

Error-correcting code memory (ECC memory) is a type of computer data storage that can detect and correct the most common kinds of internal data corruption. ECC memory is used in most computers where data corruption cannot be tolerated under any circumstances, such as for scientific or financial computing.

Typically, ECC memory maintains a memory system immune to single-bit errors: the data that is read from each word is always the same as the data that had been written to it, even if one or more bits actually stored have been flipped to the wrong state. Most non-ECC memory cannot detect errors although some non-ECC memory with parity support allows detection but not correction.
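
To make "detect and correct" concrete, here is a toy sketch of single-bit error correction using a Hamming(7,4) code: 4 data bits protected by 3 parity bits. This is an illustration of the general technique only. Real ECC DIMMs typically protect 64-bit words with 8 extra check bits, and nothing here is meant to represent any particular memory controller. The `encode` and `decode` names are my own.

```python
def encode(data):
    """Encode 4 data bits into a 7-bit Hamming codeword.

    Bit positions 1..7 hold: p1 p2 d1 p3 d2 d3 d4.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def decode(codeword):
    """Return (data_bits, error_position).

    error_position is 0 when no error is detected; otherwise it is the
    1-based position of the single flipped bit, which gets corrected.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # the syndrome names the bad bit
    if position:
        c[position - 1] ^= 1  # flip it back
    return [c[2], c[4], c[5], c[6]], position


word = encode([1, 0, 1, 1])
damaged = list(word)
damaged[4] ^= 1  # simulate one flipped bit in storage
print(decode(damaged))
```

Any single flipped bit is located and repaired. Two flipped bits in the same codeword would be silently miscorrected by this plain code, which is why practical schemes add an extra parity bit so that double-bit errors can at least be detected.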

It's received wisdom in the sysadmin community that you always build servers with ECC RAM because, well, you build servers to be reliable, right? Why would anyone intentionally build a server that isn't reliable? Are you crazy, man? Well, looking at that cobbled together Google 1999 server rack, which also utterly lacked any form of ECC RAM, I'm inclined to think that reliability measured by "lots of redundant boxes" is more worthwhile and easier to achieve than the platonic ideal of making every individual server bulletproof.

Being the type of guy who likes to question stuff… I began to question. Why is it that ECC is so essential anyway? If ECC was so important, so critical to the reliable function of computers, why isn't it built in to every desktop, laptop, and smartphone in the world by now? Why is it optional? This smells awfully… enterprisey to me.

Now, before everyone stops reading and I get permanently branded as "that crazy guy who hates ECC", I think ECC RAM is fine:

  • The cost difference between ECC and not-ECC is minimal these days.
  • The performance difference between ECC and not-ECC is minimal these days.
  • Even if ECC only protects you from rare 1% hardware error cases that you may never hit until you literally build hundreds or thousands of servers, it's cheap insurance.

I am not anti-insurance, nor am I anti-ECC. But I do seriously question whether ECC is as operationally critical as we have been led to believe, and I think the data shows modern, non-ECC RAM is already extremely reliable.

First, let's look at the Puget Systems reliability stats. These guys build lots of commodity x86 gamer PCs, burn them in, and ship them. They helpfully track statistics on how many parts fail either from burn-in or later in customer use. Go ahead and read through the stats.

For the last two years, CPU reliability has dramatically improved. What is interesting is that this lines up with the launch of the Intel Haswell CPUs which was when the CPU voltage regulation was moved from the motherboard to the CPU itself. At the time we theorized that this should raise CPU failure rates (since there are more components on the CPU to break) but the data shows that it has actually increased reliability instead.

Even though DDR4 is very new, reliability so far has been excellent. Where DDR3 desktop RAM had an overall failure rate in 2014 of ~0.6%, DDR4 desktop RAM had absolutely no failures.

SSD reliability has dramatically improved recently. This year Samsung and Intel SSDs only had a 0.2% overall failure rate compared to 0.8% in 2013.

Modern commodity computer parts from reputable vendors are amazingly reliable. And their trends show from 2012 onward essential PC parts have gotten more reliable, not less. (I can also vouch for the improvement in SSD reliability as we have had zero server SSD failures in 3 years across our 12 servers with 24+ drives, whereas in 2011 I was writing about the Hot/Crazy SSD Scale.) And doesn't this make sense from a financial standpoint? How does it benefit you as a company to ship unreliable parts? That's money right out of your pocket and the reseller's pocket, plus time spent dealing with returns.

We had a, uh, "spirited" discussion about this internally on our private Discourse instance.

This is not a new debate by any means, but I was frustrated by the lack of data out there. In particular, I'm really questioning the difference between "soft" and "hard" memory errors:

But what is the nature of those errors? Are they soft errors – as is commonly believed – where a stray Alpha particle flips a bit? Or are they hard errors, where a bit gets stuck?

I absolutely believe that hard errors are reasonably common. RAM DIMMs can have bugs, or the chips on the DIMM can fail, or there's a design flaw in circuitry on the DIMM that only manifests in certain corner cases or under extreme loads. I've seen it plenty. But a soft error where a bit of memory randomly flips?

There are two types of soft errors, chip-level soft error and system-level soft error. Chip-level soft errors occur when the radioactive atoms in the chip's material decay and release alpha particles into the chip. Because an alpha particle contains a positive charge and kinetic energy, the particle can hit a memory cell and cause the cell to change state to a different value. The atomic reaction is so tiny that it does not damage the actual structure of the chip.

Outside of airplanes and spacecraft, I have a difficult time believing that soft errors happen with any frequency, otherwise most of the computing devices on the planet would be crashing left and right. I deeply distrust the anecdotal voodoo behind "but one of your computer's memory bits could flip, you'd never know, and corrupted data would be written!" It'd be one thing if we observed this regularly, but I've been unhealthily obsessed with computers since birth and I have never found random memory corruption to be a real, actual problem on any computers I have either owned or had access to.

But who gives a damn what I think. What does the data say?

A 2007 study found that the observed soft error rate in live servers was two orders of magnitude lower than previously predicted:

Our preliminary result suggests that the memory soft error rate in two real production systems (a rack-mounted server environment and a desktop PC environment) is much lower than what the previous studies concluded. Particularly in the server environment, with high probability, the soft error rate is at least two orders of magnitude lower than those reported previously. We discuss several potential causes for this result.

A 2009 study on Google's server farm notes that soft errors were difficult to find:

We provide strong evidence that memory errors are dominated by hard errors, rather than soft errors, which previous work suspects to be the dominant error mode.

Yet another large scale study from 2012 discovered that RAM errors were dominated by permanent failure modes typical of hard errors:

Our study has several main findings. First, we find that approximately 70% of DRAM faults are recurring (e.g., permanent) faults, while only 30% are transient faults. Second, we find that large multi-bit faults, such as faults that affects an entire row, column, or bank, constitute over 40% of all DRAM faults. Third, we find that almost 5% of DRAM failures affect board-level circuitry such as data (DQ) or strobe (DQS) wires. Finally, we find that chipkill functionality reduced the system failure rate from DRAM faults by 36x.

In the end, we decided the non-ECC RAM risk was acceptable for every tier of service except our databases. Which is kind of a bummer since higher end Skylake Xeons got pushed back to the big Purley platform upgrade in 2017. Regardless, we burn in every server we build with a complete run of memtest86 and overnight prime95/mprime, and you should too. There's one whirring away through endless memory tests right behind me as I write this.
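
For a feel of what a memory burn-in is doing conceptually, here is a toy write-then-verify pattern test. To be clear about the assumptions: this runs against an ordinary Python `bytearray`, not physical RAM, so it is no substitute for a real memory tester. The `corrupt` hook and the `stuck_bit` fault are hypothetical, added only to simulate a hard error.

```python
# All zeros, all ones, and two alternating-bit patterns.
PATTERNS = (0x00, 0xFF, 0xAA, 0x55)


def pattern_test(buf, patterns=PATTERNS, corrupt=None):
    """Fill buf with each pattern, read it back, and collect mismatches.

    corrupt is an optional, purely illustrative hook that damages the
    buffer between the write and the read to simulate faulty memory.
    Returns a list of (pattern, index) pairs that failed verification.
    """
    failures = []
    for pattern in patterns:
        buf[:] = bytes([pattern]) * len(buf)
        if corrupt is not None:
            corrupt(buf)
        failures.extend(
            (pattern, i) for i, byte in enumerate(buf) if byte != pattern
        )
    return failures


def stuck_bit(buf):
    """Simulated hard error: the lowest bit of byte 3 is stuck at 1."""
    buf[3] |= 0x01


print(pattern_test(bytearray(64)))
print(pattern_test(bytearray(64), corrupt=stuck_bit))
```

Note that the simulated stuck bit only shows up under the patterns that try to store a 0 in that position. That is the reason testers cycle through several patterns rather than just one.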

I find it very, very suspicious that ECC – if it is so critical to preventing these random, memory corrupting bit flips – has not already been built into every type of RAM that we ship in the ubiquitous computing devices all around the world as a cost of doing business. But I am by no means opposed to paying a small insurance premium for server farms, either. You'll have to look at the data and decide for yourself. Mostly I wanted to collect all this information in one place so people who are also evaluating the cost/benefit of ECC RAM for themselves can read the studies and decide what they want to do.

Please feel free to leave comments if you have other studies to cite, or significant measured data to share.

Categories: Programming

Building a PC, Part VIII: Iterating

Thu, 09/17/2015 - 23:55

The last time I seriously upgraded my PC was in 2011, because the PC is over. And in some ways, it truly is – they can slap a ton more CPU cores on a die, for sure, but the overall single core performance increase from a 2011 high end Intel CPU to today's high end Intel CPU is … really quite modest, on the order of maybe 30% to 40%.

In that same timespan, mobile and tablet CPU performance has continued to just about double every year. Which means the forthcoming iPhone 6s will be almost 10 times faster than the iPhone 4 was.

iPhone single core geekbench results

Remember, that's only single core CPU performance – I'm not even factoring in the move from single, to dual, to triple core as well as generally faster memory and storage. This stuff is old hat on desktop, where we've had mainstream dual cores for a decade now, but they are huge improvements for mobile.

When your mobile devices get 10 times faster in the span of four years, it's hard to muster much enthusiasm for a modest 1.3 × or 1.4 × iterative improvement in your PC's performance over the same time.
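
The gap is easier to appreciate as a compound annual rate. A quick sketch, using only the multipliers quoted above (10× for mobile, 1.3× to 1.4× for PC, each over four years):

```python
def annual_factor(total_factor, years):
    """Constant yearly multiplier that compounds to total_factor."""
    return total_factor ** (1 / years)


mobile = annual_factor(10, 4)
pc_low = annual_factor(1.3, 4)
pc_high = annual_factor(1.4, 4)

print(f"mobile: {mobile:.2f}x per year")
print(f"PC:     {pc_low:.2f}x to {pc_high:.2f}x per year")
```

By this arithmetic, 10× in four years is roughly 1.8× per year, a bit short of strict doubling (which would compound to 16×), while the PC figure is well under 10% per year.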

I've been slogging away at this for a while; my current PC build series spans 7 years:

The fun part of building a PC is that it's relatively easy to swap out the guts when something compelling comes along. CPU performance improvements may be modest these days, but there are still bright spots where performance is increasing more dramatically. Mainly in graphics hardware and, in this case, storage.

The current latest-and-greatest Intel CPU is Skylake. Like Sandy Bridge in 2011, which brought us much faster 6 Gbps SSD-friendly drive connectors (although only two of them), the Skylake platform brings us another key storage improvement – the ability to connect hard drives directly to the PCI Express lanes. Which looks like this:

… and performs like this:

Now there's the 3× performance increase we've been itching for! To be fair, a raw increase of 3× in drive performance doesn't necessarily equate to a computer that boots in one third the time. But here's why disk speed matters:

If the CPU registers are how long it takes you to fetch data from your brain, then going to disk is the equivalent of fetching data from Pluto.

What I've always loved about SSDs is that they attack the PC's worst-case performance scenario, when information has to come off the slowest device inside your computer – the hard drive. SSDs massively reduced the variability of requests for data. Let's compare L1 cache access time to minimum disk access time:

Traditional hard drive
0.9 ns → 10 ms (variability of 11,111,111×)

SSD
0.9 ns → 150 µs (variability of 166,667×)

SSDs provide a reduction in overall performance variability of 66×! And when comparing latency:

7200rpm HDD — 1800ms
SATA SSD — 4ms
PCIe SSD — 0.34ms

Even going from a fast SATA SSD to a PCI Express SSD, you're looking at a 10x reduction in drive latency.
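
All of these ratios fall straight out of the quoted figures. A small sketch to reproduce them, with every input taken from the numbers above:

```python
L1_CACHE_NS = 0.9
HDD_NS = 10 * 1_000_000  # 10 ms in nanoseconds
SSD_NS = 150 * 1_000     # 150 µs in nanoseconds

hdd_variability = HDD_NS / L1_CACHE_NS
ssd_variability = SSD_NS / L1_CACHE_NS
variability_reduction = hdd_variability / ssd_variability

SATA_SSD_MS = 4
PCIE_SSD_MS = 0.34
latency_reduction = SATA_SSD_MS / PCIE_SSD_MS

print(f"HDD variability: {hdd_variability:,.0f}x")
print(f"SSD variability: {ssd_variability:,.0f}x")
print(f"Reduction in variability: {variability_reduction:.1f}x")
print(f"SATA to PCIe latency reduction: {latency_reduction:.1f}x")
```

The SATA to PCIe step works out slightly better than the quoted 10×.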

Here's what you need:

These are the basics. It's best to use the M.2 connection as a fast boot / system drive, so I scaled it back to the smaller 256 GB version. I also had a lot of trouble getting my hands on the faster i7-6700k CPU, which appears supply constrained and is currently overpriced as a result.

(Also, be careful, as some older M.2 drives can use the older AHCI connection type. Make sure yours is NVMe.)

Even though the days of doubling (or even 1.5×-ing) CPU performance are long gone for PCs, there are still some key iterative performance milestones to hit. Like mainstream 4k displays, I believe mainstream PCI express SSDs are another important step in the overall evolution of desktop computing. Or its corpse, anyway.

Categories: Programming

Our Brave New World of 4K Displays

Tue, 08/18/2015 - 10:39

It's been three years since I last upgraded monitors. Those inexpensive Korean 27" IPS panels, with a resolution of 2560×1440 – also known as 1440p – have served me well. You have no idea how many people I've witnessed being Wrong On The Internet on these babies.

I recently got the upgrade itch real bad:

  • 4K monitors have stabilized as a category, moving from super bleeding edge "I'm probably going to regret buying this" early adopter stuff toward mainstream maturity.

  • Windows 10, with its promise of better high DPI handling, was released. I know, I know, we've been promised reasonable DPI handling in Windows for the last five years, but hope springs eternal. This time will be different!™

  • I needed a reason to buy a new high end video card, which I was also itching to upgrade, and simplify from a dual card config back to a (very powerful) single card config.

  • I wanted to rid myself of the monitor power bricks and USB powered DVI to DisplayPort converters that those Korean monitors required. I covet simple, modern DisplayPort connectors. I was beginning to feel like a bad person because I had never even owned a display that had a DisplayPort connector. First world problems, man.

  • 1440p at 27" is decent, but it's also … sort of an awkward no-man's land. Nowhere near high enough resolution to be retina, but it is high enough that you probably want to scale things a bit. After living with this for a few years, I think it's better to just suck it up and deal with giant pixels (34" at 1440p, say), or go with something much more high resolution and trust that everyone is getting their collective act together by now on software support for high DPI.

Given my great experiences with modern high DPI smartphone and tablet displays (are there any other kind these days?), I want those same beautiful high resolution displays on my desktop, too. I'm good enough, I'm smart enough, and doggone it, people like me.

I was excited, then, to discover some strong recommendations for the Asus PB279Q.

The Asus PB279Q is a 27" panel, same size as my previous cheap Korean IPS monitors, but it is more premium in every regard:

  • 3840×2160
  • "professional grade" color reproduction
  • thinner bezel
  • lighter weight
  • semi-matte (not super glossy)
  • integrated power (no external power brick)
  • DisplayPort 1.2 and HDMI 1.4 support built in

It is also a more premium monitor in price, at around $700, whereas I got my super-cheap no-frills Korean IPS 1440p monitors for roughly half that price. But when I say no-frills, I mean it – these Korean monitors didn't even have on-screen controls!

4K is a surprisingly big bump in resolution over 1440p — we go from 3.7 to 8.3 megapixels.

But, is it … retina?

It depends how you define that term, and from what distance you're viewing the screen. Per Is This Retina:

27" 3840×2160 'retina' at a viewing distance of 21"
27" 2560×1440 'retina' at a viewing distance of 32"

With proper computer desk ergonomics you should be sitting with the top of your monitor at eye level, at about an arm's length in front of you. I just measured my arm and, fully extended, it's about 26". Sitting at my desk, I'm probably about that distance from my monitor or a bit closer, but certainly beyond the 21" necessary to call this monitor 'retina' despite being 163 PPI. It definitely looks that way to my eye.
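
The megapixel, PPI, and viewing distance figures can all be derived from the resolution and the diagonal. The sketch below assumes the common rule of thumb that a display counts as 'retina' once a single pixel subtends no more than one arcminute at the eye. I'm assuming that criterion rather than quoting the calculator's actual method, though it does reproduce the distances above.

```python
import math


def megapixels(width, height):
    return width * height / 1e6


def ppi(width, height, diagonal_inches):
    """Pixels per inch along the diagonal."""
    return math.hypot(width, height) / diagonal_inches


def retina_distance_inches(pixels_per_inch):
    """Viewing distance at which one pixel subtends one arcminute
    (an assumed threshold for roughly 20/20 vision)."""
    one_arcminute = math.radians(1 / 60)
    return 1 / (pixels_per_inch * math.tan(one_arcminute))


for name, w, h in [("4K", 3840, 2160), ("1440p", 2560, 1440)]:
    density = ppi(w, h, 27)
    print(
        f'{name} at 27": {megapixels(w, h):.1f} MP, {density:.0f} PPI, '
        f"retina beyond {retina_distance_inches(density):.0f} inches"
    )
```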

I have more words to write here, but let's cut to the chase for the impatient and the TL;DR crowd. This 4K monitor is totally amazing and you should buy one. It feels exactly like going from the non-retina iPad 2 to the retina iPad 3 did, except on the desktop. It makes all the text on your screen look beautiful. There is almost no downside.

There are a few caveats, though:

  • You will need a beefy video card to drive a 4K monitor. I personally went all out for the GeForce 980 Ti, because I might want to actually game at this native resolution, and the 980 Ti is the undisputed fastest single video card in the world at the moment. If you're not a gamer, any midrange video card should do fine.

  • Display scaling is definitely still a problem at times with a 4K monitor. You will run into apps that don't respect DPI settings and end up magnifying-glass tiny. Scott Hanselman provided many examples in January 2014, and although stuff has improved since then with Windows 10, it's far from perfect.

    Browsers scale great, and the OS does too, but if you use any desktop apps built by careless developers, you'll run into this. The only good long term solution is to spread the gospel of 4K and shame them into submission with me. Preach it, brothers and sisters!

  • Enable DisplayPort 1.2 in the monitor settings so you can turn on 60Hz. Trust me, you do not want to experience a 30Hz LCD display. It is unspeakably bad, enough to put one off computer screens forever. For people who tell you they can't see the difference between 30fps and 60fps, just switch their monitors to 30Hz and watch them squirm in pain.

    Viewing those comparison videos, I begin to understand why gamers want 90Hz, 120Hz or even 144Hz monitors. 60fps / 60 Hz should be the absolute minimum, no matter what resolution you're running. Luckily DisplayPort 1.2 enables 60 Hz at 4K, but only just. You'll need DisplayPort 1.3+ to do better than that.

  • Disable the crappy built in monitor speakers. Headphones or bust, baby!

  • Turn down the brightness from the standard factory default of retina scorching 100% to something saner like 50%. Why do manufacturers do this? Is it because they hate eyeballs? While you're there, you might mess around with some basic display calibration, too.
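
The DisplayPort caveat in the list above can be sanity-checked with rough bandwidth arithmetic. This sketch counts active pixels only at 24 bits per pixel and ignores blanking intervals, so real requirements run somewhat higher. The link figures are the nominal payload rates after 8b/10b encoding as I understand the published specs, so treat them as assumptions rather than gospel.

```python
# Nominal payload rates in Gbit/s after 8b/10b encoding (assumed figures).
LINK_GBPS = {
    "HDMI 1.4": 8.16,
    "DisplayPort 1.2": 17.28,
    "DisplayPort 1.3": 25.92,
}


def video_gbps(width, height, hz, bits_per_pixel=24):
    """Uncompressed active-pixel data rate, ignoring blanking overhead."""
    return width * height * hz * bits_per_pixel / 1e9


def fits(link, width, height, hz, bits_per_pixel=24):
    """True if the mode's raw pixel rate is within the link's payload rate."""
    return video_gbps(width, height, hz, bits_per_pixel) <= LINK_GBPS[link]


for hz in (30, 60, 120):
    rate = video_gbps(3840, 2160, hz)
    ok = [link for link in LINK_GBPS if fits(link, 3840, 2160, hz)]
    print(f"4K at {hz} Hz needs ~{rate:.1f} Gbit/s, fits: {', '.join(ok)}")
```

By this estimate HDMI 1.4 tops out at 4K 30 Hz, DisplayPort 1.2 carries 4K 60 Hz, and anything like 4K 120 Hz needs the newer link. Deeper color and blanking overhead eat into the remaining headroom.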

This Asus PB279Q 4K monitor is the best thing I've upgraded on my computer in years. Well, actually, thing(s) I've upgraded, because I am not f**ing around over here.

Flo monitor arms, front view, triple monitors

I'm a long time proponent of the triple monitor lifestyle, and the only thing better than a 4K display is three 4K displays! That's 11,520×2,160 pixels to you, or 6,480×3,840 if rotated.

(Good luck attempting to game on this configuration with all three monitors active, though. You're gonna need it. Some newer games are too demanding to run on "High" settings on a single 4K monitor, even with the mighty Nvidia 980 Ti.)

I've also been experimenting with better LCD monitor arms that properly support my preferred triple monitor configurations. Here's a picture from the back, where all the action is:

Flo monitor arms, triple monitors, rear view

These are the Flo Monitor Supports, and they free up a ton of desk space in a triple monitor configuration while also looking quite snazzy. I'm fond of putting my keyboard just under the center monitor, which isn't possible with any monitor stand.

Flo monitor arm suggested multi-monitor setups

With these Flo arms you can "scale up" your configuration from dual to triple or even quad (!) monitor later.

4K monitors are here, they're not that expensive, the desktop operating systems and video hardware are in place to properly support them, and in the appropriate size (27") we can finally have an amazing retina display experience at typical desktop viewing distances. Choose the Asus PB279Q 4K monitor, or whatever 4K monitor you prefer, but take the plunge.

In 2007, I asked Where Are The High Resolution Displays, and now, 8 years later, they've finally, finally arrived on my desktop. Praise the lord and pass the pixels!

Oh, and gird your loins for 8K one day. It, too, is coming.

Categories: Programming