
Software Development Blogs: Programming, Software Testing, Agile Project Management

Methods & Tools


High Scalability - Building bigger, faster, more reliable websites
Updated: 13 hours 48 min ago

Gone Fishin' 2014

Fri, 05/23/2014 - 16:56

Well, not exactly Fishin', but I'll be on a month long vacation starting today.

I won't be posting new content, so we'll all have a break. Disappointing, I know.

If you've ever wanted to write an article for HighScalability this would be a great time :-) I'd be very interested in your experiences with containers vs VMs if you have some thoughts on the subject.

So if the spirit moves you, please write something.

See you on down the road...

Categories: Architecture

9 Principles of High Performance Programs

Wed, 05/21/2014 - 16:54

Arvid Norberg on the libtorrent blog has put together an excellent list of principles of high performance programs, obviously derived from hard-won experience programming BitTorrent:

Two fundamental causes of performance problems:

  1. Memory Latency. A big performance problem on modern computers is the latency of SDRAM. The CPU waits idle for a read from memory to come back.
  2. Context Switching. When a CPU switches context "the memory it will access is most likely unrelated to the memory the previous context was accessing. This often results in significant eviction of the previous cache, and requires the switched-to context to load much of its data from RAM, which is slow."

Rules to help balance the forces of evil:

Categories: Architecture

It's Networking. In Space! Or How E.T. Will Phone Home.

Tue, 05/20/2014 - 16:56

What will the version of the Internet that follows us to the stars look like? Yes, people are really thinking seriously about this sort of thing. Specifically the InterPlanetary Networking Special Interest Group (IPNSIG).

Ansible-like faster-than-light communication it isn't. There's no magical warp drive. Nor is it a network of telepaths acting as a 'verse-spanning telegraph system.

It's more mundane than that. And in many ways more interesting, as it's sort of like the old Internet on steroids, the one that was based on UUCP and dial-up connections, but over vastly longer distances and with much longer delays:

Categories: Architecture

A Short On How the Wayback Machine Stores More Pages than Stars in the Milky Way

Mon, 05/19/2014 - 16:56

How does the Wayback Machine work? Now with over 400 billion webpages indexed, allowing the Internet to be browsed all the way back to 1996, it's an even more compelling question. I've looked several times but I've never found a really good answer.

Here's some information from a thread on Hacker News. It starts with mmagin, a former Archive employee:

Categories: Architecture

Stuff The Internet Says On Scalability For May 16th, 2014

Fri, 05/16/2014 - 17:10

Hey, it's HighScalability time:


Cross Section of an Undersea Cable. It's industrial art. The parts. The story.
  • 400,000,000,000: Wayback Machine pages indexed; 100 billion: Google searches per month; 10 million: Snapchat monthly user growth.
  • Quotable Quotes:
    • @duncanjw: The Great Rewrite - many apps will be rewritten not just replatformed over next 10 years says @cote #openstacksummit
    • @RFFlores: The Openstack conundrum. If you don't adopt it, you will regret it in the future. If you do adopt it, you will regret it now
    • elementai: I love Redis so much, it became like a superglue where "just enough" performance is needed to resolve a bottleneck problem, but you don't have resources to rewrite a whole thing in something fast.
    • @antirez: "when software engineering is reduced to plumbing together generic systems, software engineers lose their sense of ownership"
    • Tom Akehurst: Microservices vs. monolith is a false dichotomy.
    • @joestump: “Keep in mind that any piece of butt-based infrastructure can fail at any time. Plan your infrastructure accordingly.” Ain’t that the truth?
    • @SalesforceEng: Check out the scale of Kafka @LinkedInEng. @bonkoif says these numbers are about a month old. 3.25 million msgs/sec. 
    • Don Neufeld: The first is to look deeply into the stack of implicit assumptions I’m working with. It’s often the unspoken assumptions that are the most important ones. The second flows from the first and it’s to focus less on building the right thing and more how we’re going to meet our immediate needs.
    • Dan Gillmor: We’re in danger of losing what’s made the Internet the most important medium in history – a decentralized platform where the people at the edges of the networks – that would be you and me – don’t need permission to communicate, create and innovate.

  • If you think of a Hotel as an app, hotels have been doing in-app purchases for a long time. They lead with a teaser rate and then charge for anything that might cross a desire-money threshold. Wifi, that's extra. Gym, that's extra. The bar, a cover charge. Drinks, so so expensive. The pool, extra. A lounge by the pool is double extra extra. To go all the way hotels just need to let you stay for free and then fully monetize all the gamification points.

  • Apple: We handle hundreds of millions of active users using some of the most desirable devices on the planet and several billion iMessages/day, 40 billion push notifications/day, 16+ trillion push notifications sent to date.

  • It's a data prison for everyone! Comcast plans data caps for all customers in 5 years, could be 500GB. Or just a few 4K movies.

  • From the future of everything to the verge of extinction. The Slow Decline of Peer-to-Peer File Sharing: People have shifted their activities to streaming over file sharing. Subscribers get quality content at a reasonable price and it's dead simple to use, whereas torrenting or file sharing is a little more complicated.

  • I don't think people understand how hard this is to do in practice. European Court Lets Users Erase Records on Web. Once data is stored on tape, deleting it means rewriting all the non-deleted data to another tape. So it's far more efficient to forget the indexes to the data than to delete the data itself. Which goes against the point, I'd imagine.

  • How is a strategy tax hands off? @parislemon: Instagram's decision to use Facebook's much worse place database over Foursquare's has made the product worse. Stupid.

  • Excellent detailed example of the SEDA architecture in action. Guide to Cassandra Thread Pools. Follow the regal message as it flows from thread pool to thread pool, transforming as it makes its way to its final resting place.
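
To make the SEDA item above concrete, here is a minimal sketch of a staged pipeline in which each stage owns an input queue and a small thread pool, and hands its output to the next stage's queue. It is not Cassandra's code; the `Stage` class, the stage names, and the handlers are all hypothetical and chosen only for illustration.

```python
import queue
import threading

STOP = object()  # sentinel that tells a stage's workers to shut down


class Stage:
    """One SEDA-style stage: an input queue drained by a small thread pool."""

    def __init__(self, name, handler, workers=2, downstream=None):
        self.name = name
        self.handler = handler
        self.downstream = downstream
        self.inbox = queue.Queue()
        self.threads = [
            threading.Thread(target=self._run, daemon=True) for _ in range(workers)
        ]
        for t in self.threads:
            t.start()

    def _run(self):
        while True:
            item = self.inbox.get()
            if item is STOP:
                break
            result = self.handler(item)
            # Hand the transformed message to the next stage's queue, if any.
            if self.downstream is not None:
                self.downstream.inbox.put(result)

    def submit(self, item):
        self.inbox.put(item)

    def shutdown(self):
        # One sentinel per worker, queued behind any pending messages.
        for _ in self.threads:
            self.inbox.put(STOP)
        for t in self.threads:
            t.join()


def run_pipeline(messages):
    """Push messages through parse -> transform -> sink and return the results."""
    results = []
    lock = threading.Lock()

    def collect(msg):
        with lock:
            results.append(msg)
        return msg

    sink = Stage("sink", collect, workers=1)
    transform = Stage("transform", str.upper, workers=2, downstream=sink)
    parse = Stage("parse", str.strip, workers=2, downstream=transform)

    for m in messages:
        parse.submit(m)
    # Shut down front to back so every queued message drains first.
    for stage in (parse, transform, sink):
        stage.shutdown()
    return sorted(results)


if __name__ == "__main__":
    print(run_pipeline(["  read ", " write", "gossip  "]))
```

Because each stage has its own queue and pool, a slow stage shows up as a growing queue rather than as blocked callers, which is the property that makes per-pool monitoring useful.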

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so keep on going)...

Categories: Architecture

Paper: SwiftCloud: Fault-Tolerant Geo-Replication Integrated all the Way to the Client Machine

Thu, 05/15/2014 - 17:05

So how do you knit multiple datacenters and many thousands of phones and other clients into a single cooperating system?

Usually you don't. It's too hard. We see nascent attempts in services like Firebase and Parse. 

SwiftCloud, as described in SwiftCloud: Fault-Tolerant Geo-Replication Integrated all the Way to the Client Machine, goes two steps further by leveraging Conflict-free Replicated Data Types (CRDTs), which means "data can be replicated at multiple sites and be updated independently with the guarantee that all replicas converge to the same value. In a cloud environment, this allows a user to access the data center closer to the user, thus optimizing the latency for all users."
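
To illustrate the convergence guarantee in that quote, here is a minimal sketch of one of the simplest CRDTs, a grow-only counter. It is not SwiftCloud's implementation; the `GCounter` class and the replica names are assumptions made for the example.

```python
class GCounter:
    """Grow-only counter CRDT: each replica only ever increments its own slot."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> increments observed from that replica

    def increment(self, amount=1):
        if amount < 0:
            raise ValueError("a grow-only counter cannot be decremented")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Element-wise max is commutative, associative, and idempotent,
        # so replicas converge regardless of merge order or repetition.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)


if __name__ == "__main__":
    phone = GCounter("phone")
    datacenter = GCounter("dc-east")
    phone.increment(3)       # updated independently while offline
    datacenter.increment(2)  # updated independently at the data center
    phone.merge(datacenter)
    datacenter.merge(phone)
    print(phone.value(), datacenter.value())
```

Both replicas accept writes without coordination, and exchanging state in either order leaves them with the same value.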

While we don't see these kinds of systems just yet, they are a strong candidate for how things will work in the future, efficiently using resources at every level while supporting huge numbers of cooperating users.

Abstract:

Categories: Architecture

Google Says Cloud Prices Will Follow Moore’s Law: Are We All Renters Now?

Wed, 05/14/2014 - 16:56

After Google cut prices on their Google Cloud Platform, Amazon quickly followed with their own price cuts. Even more interesting is what the future holds for pricing. The near future looks great. After that? We'll see.

Adrian Cockcroft highlights that Google thinks prices should follow Moore’s law, which means we should expect prices to halve every 18-24 months.
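
As a rough worked example of what halving every 18-24 months implies, the sketch below projects a price forward under a fixed halving period. The function name and the starting price of 100 are hypothetical; only the 18 and 24 month periods come from the text.

```python
def projected_price(price_now, years, halving_period_years):
    """Price after `years` if it halves once every `halving_period_years`."""
    return price_now * 0.5 ** (years / halving_period_years)


if __name__ == "__main__":
    # Hypothetical starting price of 100 units, projected three years out.
    for period in (1.5, 2.0):  # 18 months and 24 months
        print(period, round(projected_price(100.0, 3, period), 2))
```

Under these assumptions, a price of 100 falls to 25 after three years with an 18 month halving period, and to roughly 35 with a 24 month period.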

That's good news. Greater cost certainty means you can make much more aggressive build out plans. With the savings you can hire more people, handle more customers, and add those media rich features you thought you couldn't afford. Design is directly related to costs.

Without Google competing with Amazon there's little doubt the price reduction curve would be much less favorable.

As a late cloud entrant Google is now in a customer acquisition phase, so they are willing to pay for customers, which means lower prices are an acceptable cost of doing business. Profit and high margins are not the objective. Getting market share is what is important.

Amazon, on the other hand, has been reaping the higher margins earned from recurring customers. So Google's entrance into the early product life cycle phase is making Amazon eat into their margins and is forcing down prices overall.

But there's a floor to how low prices can go. Alen Peacock, co-founder of Space Monkey, has an interesting graphic telling the story. This is Amazon's historical pricing for 1TB of storage in S3, graphed as a multiple of the historical pricing for 1TB of local hard disk:

Alen explains it this way:

Cloud prices do decrease over time, and have dropped significantly over the timespan shown in the graph, but this graph shows cloud storage prices as a multiple of hard disk prices. In other words, hard disk prices are dropping much faster than datacenter prices. This is because, right, datacenters have costs other than hard disks (power, cooling, bandwidth, building infrastructure, diesel backup generators, battery backup systems, fire suppression, staffing, etc). Most of those costs do not follow Moore's Law -- in fact energy costs are on a long trend upwards. So over time, the gap shown by the graph should continue to widen.

 

The economic advantages of building your own (but housed in datacenters) is there, but it isn't huge. There is also some long term strategic advantage to building your own, e.g., GDrive dropped their price dramatically at will because Google owns their datacenters, but Dropbox couldn't do that without convincing Amazon to drop the price they pay for S3.

Costs other than hardware began dominating in datacenters several years ago, so Moore's Law-like effects are dampened. Energy and cooling costs do not follow Moore's Law, and those costs make up a huge component of the overall picture in datacenters. This is only going to get worse, barring some radical new energy production technology arriving on the scene.

What we're [Space Monkey] interested in, long term, is dropping the floor out from underneath all of these, and I think that only happens if you get out of the datacenter entirely.
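
A toy model shows why the multiple described above widens. Assume the cloud price is the disk cost plus a fixed overhead (power, cooling, staffing) that does not follow Moore's Law. Every number here is a hypothetical placeholder, not data from the graph or from Alen's analysis.

```python
def cloud_to_disk_ratio(years, disk_cost=50.0, overhead_cost=50.0,
                        disk_halving_years=2.0, overhead_growth=0.0):
    """Cloud price as a multiple of raw disk price after `years`.

    All defaults are hypothetical: the disk cost halves every
    `disk_halving_years`, while overhead grows by `overhead_growth` per year.
    """
    disk = disk_cost * 0.5 ** (years / disk_halving_years)
    overhead = overhead_cost * (1 + overhead_growth) ** years
    return (disk + overhead) / disk


if __name__ == "__main__":
    for year in (0, 2, 4):
        print(year, cloud_to_disk_ratio(year))
```

Even with flat overhead the multiple climbs from 2x to 3x to 5x in this model, and rising energy costs would only steepen it.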

As the size of the cloud market is still growing, there will still be a fight for market share. When growth slows and the market is divided between major players, a collusive pricing phase will take over. Cloud customers are sticky customers. It's hard to move off a cloud. The need for higher margins to justify the cash flow drain during the customer acquisition phase will reverse the favorable trends we are seeing now.

Until then it seems the economics indicate we are in a rent, not a buy world.

Related Articles 
  • IaaS Series: Cloud Storage Pricing – How Low Can They Go? - "For now it seems we can assume we’ve not seen the last of the big price reductions."
  • The Cloud Is Not Green
  • Brad Stone: “Bill Miller, the chief investment officer at Legg Mason Capital Management and a major Amazon shareholder, asked Bezos at the time about the profitability prospects for AWS. Bezos predicted they would be good over the long term but said that he didn’t want to repeat “Steve Jobs’s mistake” of pricing the iPhone in a way that was so fantastically profitable that the smartphone market became a magnet for competition.” 
Categories: Architecture

Sponsored Post: Apple, Cloudant, CopperEgg, Logentries, Wargaming.net, PagerDuty, HelloSign, CrowdStrike, Gengo, ScaleOut Software, Couchbase, MongoDB, BlueStripe, AiScaler, Aerospike, LogicMonitor, AppDynamics, ManageEngine, Site24x7

Tue, 05/13/2014 - 16:56

Who's Hiring?

  • Apple has multiple openings. Changing the world is all in a day's work at Apple. Imagine what you could do here.
    • Enterprise Software Engineer. Apple's Emerging Technology Services group provides a Java based SOA platform for various applications to interact with each other. The platform is designed to handle millions of messages a day with very low latency. We have an immediate opening for a talented Software Engineer in a highly visible team who is passionate about exploring emerging technologies to create elegant scalable solutions. Please apply here
    • Mobile Services Software Engineer. The Emerging Technologies/Mobile Services team is looking for a proactive and hardworking software engineer to join our team. The team is responsible for a variety of high quality and high performing mobile services and applications for internal use. Please apply here
    • Sr. Software Engineer-iOS Systems. Do you love building highly scalable, distributed web applications? Does the idea of performance tuning Java applications make your heart leap? If so, iOS Systems is looking for a highly motivated, detail-oriented, energetic individual with excellent written and oral skills who is not afraid to think outside the box and question assumptions. Please apply here
    • Senior Software Engineering Manager. As a Senior Software Engineering Manager on our team, you will be managing teams of very dedicated and talented engineers. You will be responsible for managing the development of a mobile point of sale system on iPod touch hardware. Please apply here.
    • Sr Software Engineer - Messaging Services. An exciting opportunity for a Software Engineer to join Apple's Messaging Services team. We build the cloud systems that power some of the busiest applications in the world, including iMessage, FaceTime and Apple Push Notifications. We handle hundreds of millions of active users using some of the most desirable devices on the planet and several billion iMessages/day, 40 billion push notifications/day, 16+ trillion push notifications sent to date. Please apply here.

  • Engine Programmer - C/C++. Wargaming|BigWorld is seeking Engine Programmers to join our team in Sydney, Australia. We offer a relocation package, Australian working visa & great salary + bonus. Your primary responsibility will be to work on our PC engine. Please apply here

  • Senior Engineer wanted for large scale, security oriented distributed systems application that offers career growth and independent work environment. Use your talents for good instead of getting people to click ads at CrowdStrike. Please apply here.

  • Ops Engineer - Are you passionate about scaling and automating cloud-based websites? Love Puppet and deployment scripts? Want to take advantage of both your sys-admin and DevOps skills? Join HelloSign as our second Ops Engineer and help us scale as we grow! Apply at http://www.hellosign.com/info/jobs

  • Human Translation Platform Gengo Seeks Sr. DevOps Engineer. Build an infrastructure capable of handling billions of translation jobs, worked on by tens of thousands of qualified translators. If you love playing with Amazon’s AWS, understand the challenges behind release-engineering, and get a kick out of analyzing log data for performance bottlenecks, please apply here.

  • UI Engineer. AppDynamics, founded in 2008 and led by proven innovators, is looking for a passionate UI Engineer to design, architect, and develop their user interface using the latest web and mobile technologies. Make the impossible possible and the hard easy. Apply here.

  • Software Engineer - Infrastructure & Big Data. AppDynamics, leader in next generation solutions for managing modern, distributed, and extremely complex applications residing in both the cloud and the data center, is looking for Software Engineers (All-Levels) to design and develop scalable software written in Java and MySQL for the backend component of software that manages application architectures. Apply here.
Fun and Informative Events

  • The Biggest MongoDB Event Ever Is On. Will You Be There? Join us in New York City June 23-25 for MongoDB World! The conference lineup includes Amazon CTO Werner Vogels and Cloudera Co-Founder Mike Olson for keynote addresses.  You’ll walk away with everything you need to know to build and manage modern applications. Register before April 4 to take advantage of super early bird pricing.

  • Upcoming Webinar: Practical Guide to SQL - NoSQL Migration. Avoid common pitfalls of NoSQL deployment with the best practices in this May 8 webinar with Anton Yazovskiy of Thumbtack Technology. He will review key questions to ask before migration, and differences in data modeling and architectural approaches. Finally, he will walk you through a typical application based on RDBMS and will migrate it to NoSQL step by step. Register for the webinar.
Cool Products and Services
  • The NoSQL "Family Tree" from Cloudant explains the NoSQL product landscape using an infographic. The highlights: NoSQL arose from "Big Data" (before it was called "Big Data"); NoSQL is not "One Size Fits All"; Vendor-driven versus Community-driven NoSQL.  Create a free Cloudant account and start the NoSQL goodness

  • Finally, log management and analytics can be easy, accessible across your team, and provide deep insights into data that matters across the business - from development, to operations, to business analytics. Create your free Logentries account here.

  • CopperEgg. Simple, Affordable Cloud Monitoring. CopperEgg gives you instant visibility into all of your cloud-hosted servers and applications. Cloud monitoring has never been so easy: lightweight, elastic monitoring; root cause analysis; data visualization; smart alerts. Get Started Now.

  • PagerDuty helps operations and DevOps engineers resolve problems as quickly as possible. By aggregating errors from all your IT monitoring tools, and allowing easy on-call scheduling that ensures the right alerts reach the right people, PagerDuty increases uptime and reduces on-call burnout—so that you only wake up when you have to. Thousands of companies rely on PagerDuty, including Netflix, Etsy, Heroku, and Github.

  • Aerospike Releases Client SDK for Node.js 0.10.x. This client makes it easy to build applications in Node.js that need to store and retrieve data from a high-performance Aerospike cluster. This version exposes Key-Value Store functionality - which is the core of Aerospike's In-Memory NoSQL Database. Platforms supported: CentOS 6, RHEL 6, Debian 6, Debian7, Mac OS X, Ubuntu 12.04. Write your first app: https://github.com/aerospike/aerospike-client-nodejs.

  • consistent: to be, or not to be. That’s the question. Is data in MongoDB consistent? It depends. It’s a trade-off between consistency and performance. However, does performance have to be sacrificed to maintain consistency? more.

  • Do Continuous MapReduce on Live Data? ScaleOut Software's hServer was built to let you hold your daily business data in-memory, update it as it changes, and concurrently run continuous MapReduce tasks on it to analyze it in real-time. We call this "stateful" analysis. To learn more check out hServer.

  • LogicMonitor is the cloud-based IT performance monitoring solution that enables companies to easily and cost-effectively monitor their entire IT infrastructure stack – storage, servers, networks, applications, virtualization, and websites – from the cloud. No firewall changes needed - start monitoring in only 15 minutes utilizing customized dashboards, trending graphs & alerting.

  • BlueStripe FactFinder Express is the ultimate tool for server monitoring and solving performance problems. Monitor URL response times and see if the problem is the application, a back-end call, a disk, or OS resources.

  • aiScaler, aiProtect, aiMobile Application Delivery Controller with integrated Dynamic Site Acceleration, Denial of Service Protection and Mobile Content Management. Cloud deployable. Free instant trial, no sign-up required.  http://aiscaler.com/

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network.

If any of these items interest you there's a full description of each sponsor below. Please click to read more...

Categories: Architecture

4 Architecture Issues When Scaling Web Applications: Bottlenecks, Database, CPU, IO

Mon, 05/12/2014 - 16:56

This is a guest repost by Venkatesh CM at Architecture Issues Scaling Web Applications.

In this post I will cover architecture issues that show up while scaling and performance tuning large-scale web applications.

Let's start by defining a few terms to create a common understanding and vocabulary. Later on I will go through different issues that pop up while scaling web applications, like:

  • Architecture bottlenecks
  • Scaling Database
  • CPU Bound Application
  • IO Bound Application

Determining the optimal thread pool size of a web application will be covered in the next post.

Performance
Categories: Architecture

Stuff The Internet Says On Scalability For May 9th, 2014

Fri, 05/09/2014 - 16:56

Hey, it's HighScalability time:


NASA captures Guatemala volcano erupting from space 
  • 40,000 exabytes: projected size of the digital universe in 2020, as it roughly doubles every two years; $650,000: amount raised by the MaydayPAC in one week.
  • Quotable Quotes:
    • @BenedictEvans: Masayoshi Son: $20m initial investment in Alibaba, current stake worth $58bn.
    • @iamdevloper: I sneezed earlier and Siri compiled it to valid Perl.
    • @cdixon: "There is not enough competition in the last mile market to allow a true market to function" 
    • @PatrickMcFadin: Get ready for some serious server density. AMD is working on K12, brand-new x86 and ARM cores. This plus 8T SSD? 

  • With age comes changing priorities. Facebook is now 10 and has grown up. They are no longer moving fast and breaking things. They are now into the stability thing. Letting developers know they are a stable platform. The play is to get all that beautiful data from developers by being the platform for the Internet. On which an ad platform is built like a castle protecting a river valley. Interesting that Twitter said No! to becoming a platform, turning away developers. What has happened to Twitter's growth? The thought processes that lead to such different conclusions about the future would be interesting to understand.

  • Better than a Tauntaun roasted over an open light saber. An ode to 17 databases in 33 minutes - RailsConf2014 by tobyhede. My favorite is "MySQL - The same as PostgreSQL but controlled by an evil overlord." 

  • Well, when you explain it that way...why GNU grep is fast: GNU grep is fast because it AVOIDS LOOKING AT EVERY INPUT BYTE; GNU grep is fast because it EXECUTES VERY FEW INSTRUCTIONS FOR EACH BYTE that it *does* look at.

  • How Gilt's Insane Traffic Spikes Pushed It Off Rails To Scala. It's unusual for your expected traffic pattern to be a 100x spike once a day for 15 minutes, but that's the life of flash sales. Gilt started as a Rails app. That didn't scale. They switched from Java to Scala because the Java system became too monolithic. They also bought into Akka and the whole Reactive platform idea. The architecture is organized as hundreds of microservices. Microservices keep a wall between unrelated services, reduce complexity, and keep development friction-free.
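
The GNU grep item above credits much of its speed to not looking at every input byte. A well-known way to get that effect is a Boyer-Moore style search, which uses the last byte of the current window to decide how far to jump. The sketch below is the simpler Boyer-Moore-Horspool variant, written for illustration; it is not GNU grep's code and `horspool_find` is a hypothetical name.

```python
def horspool_find(haystack: bytes, needle: bytes) -> int:
    """Return the index of the first match of needle in haystack, or -1."""
    m, n = len(needle), len(haystack)
    if m == 0:
        return 0
    # For every byte of the needle except the last, record how far the window
    # may shift when that byte is aligned with the window's final position.
    shift = {b: m - 1 - i for i, b in enumerate(needle[:-1])}
    pos = 0
    while pos + m <= n:
        if haystack[pos:pos + m] == needle:
            return pos
        # A byte that never occurs in the needle lets us skip a full window,
        # so many input bytes are never examined at all.
        pos += shift.get(haystack[pos + m - 1], m)
    return -1


if __name__ == "__main__":
    print(horspool_find(b"hello world", b"world"))
```

The longer the pattern, the larger the jumps, which is why this family of algorithms can run faster on longer search strings.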

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so keep on going)...

Categories: Architecture

Update on Disqus: It's Still About Realtime, But Go Demolishes Python

Wed, 05/07/2014 - 16:56

Our last article on Disqus, How Disqus Went Realtime With 165K Messages Per Second And Less Than .2 Seconds Latency, is a little out of date. The folks at Disqus have been busy implementing, not talking, so we don't know a lot about what they are doing now. We do have a short update in C1MM and NGINX by John Watson and an article, Trying out this Go thing.

So Disqus has grown a bit:

  • 1.3 billion unique visitors
  • 10 billion page views
  • 500 million users engaged in discussions
  • 3 million communities
  • 25 million comments

They are still all about realtime, but Go replaced Python in their Realtime system:

Categories: Architecture

The Quest for Database Scale: the 1 M TPS challenge - Three Design Points and Five common Bottlenecks to avoid

Tue, 05/06/2014 - 16:56

This is a guest post by Rajkumar Iyer, a Member of Technical Staff at Aerospike.

About a year ago, Aerospike embarked upon a quest to increase in-memory database performance - 1 Million TPS on a single inexpensive commodity server. NoSQL has the reputation of speed, and we saw great benefit from improving latency and throughput of cacheless architectures. At that time, we took a version of Aerospike delivering about 200K TPS, improved a few things - performance went to 500k TPS - and published the Aerospike 2.0 Community Edition. We then used kernel tuning techniques and published the recipe for how we achieved 1 M TPS on $5k of hardware.

This year we continued the quest. Our goal was to achieve 1 Million database transactions per second per server, more than doubling previous performance. This compares to Cassandra’s boast of 1M TPS on over 300 servers in Google Compute Engine, at a cost of $2 million per year. We achieved this without kernel tuning. 

This article describes the three design points we kept in mind and the five common bottlenecks we avoided to create a simpler recipe you can follow for high performance operations with Aerospike.

Three Design Points
Categories: Architecture

Stuff The Internet Says On Scalability For May 2nd, 2014

Fri, 05/02/2014 - 16:56

Hey, it's HighScalability time:


Google's new POWER8 server motherboard
  • 1 trillion: number of scents your nose can smell; millions of square feet: sprawling new server farms
  • Quotable Quotes:
    • Gideon Lewis-Kraus: As the engineer and writer Alex Payne put it, these startups represent “the field offices of a large distributed workforce assembled by venture capitalists and their associate institutions,” doing low-overhead, low-risk R&D for five corporate giants. In such a system, the real disillusionment isn’t the discovery that you’re unlikely to become a billionaire; it’s the realization that your feeling of autonomy is a fantasy, and that the vast majority of you have been set up to fail by design.
    • @aphyr: I was already sold on immutability, pure functions, combinators, etc. What forced me out of Haskell was the impenetrable, haphazard docs.
    • @monicalent: Going to be migrating away from Rackspace to save some cash. Recommendations? Both Digital Ocean and Linode are half the price for 1GB RAM.
    • Linus Torvalds: So while I'd love for 'make' to be super-efficient, at the same time I'd much rather optimize the kernel to do what make needs really well, and have CPU's that don't take too long either.
    • Joe Landman: when the networking revolution comes, the cheap switches will be the first ones against the wall
    • @etherealmind: Buying public cloud can say I can't afford a house so I'll buy a tent. Because that works just as well, right ?
    • Neil DeGrasse Tyson: The act of doing it perfectly is the measure of it going unnoticed.[....] when it's done perfectly it goes unnoticed or, at best, it's just taken for granted.
  • How we scaled Freshdesk (Part I) – Before Sharding. Requests per week boomed from 2 million to 65 million. They scaled vertically for as long as they could to handle the increased load. Increasing RAM, CPU and I/O. First using read slaves for their heavily read weighted traffic, then assigning queries to particular slaves. Writes still needed to scale, so they turned on MySQL partitioning. Then caching of objects and html partials. They also used different storage engines for different functions. RedShift for analytics and data mining. Redis for state and as a job queue. But in the end they had to embrace the shard.

  • This took some guts. Foursquare uses data driven decision making to decide to deportalize their app: We looked at the session analysis and saw that only 1 in 20 sessions had both social and discovery. Why not actually just split those apart, because 19 out of 20 times, tapping on one icon or the other, you have satisfied your need completely.

  • The blockchain story is bullshit: Looking at the blockchain from a realist’s standpoint, it is not obvious that there is a need for a worse-performing database, that an unregulated oligarchy has disproportionate power over, that isn’t improved with administrator arbitration. It looks like a technology looking for a problem to solve, rather than a technology created to solve a problem.
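
The Freshdesk item above ends with "embrace the shard." As a minimal sketch of what that involves at the application layer, the router below maps a tenant to one of several database shards with a stable hash. Freshdesk's actual scheme is not described here; `ShardRouter`, the shard names, and the modulo placement are all assumptions.

```python
import hashlib


class ShardRouter:
    """Map a tenant id to one of a fixed list of shards (hypothetical scheme)."""

    def __init__(self, shard_names):
        if not shard_names:
            raise ValueError("at least one shard is required")
        self.shard_names = list(shard_names)

    def shard_for(self, tenant_id):
        # Use a stable hash so every app server picks the same shard;
        # Python's built-in hash() is randomized per process for strings.
        digest = hashlib.sha256(str(tenant_id).encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.shard_names)
        return self.shard_names[index]


if __name__ == "__main__":
    router = ShardRouter(["shard-0", "shard-1", "shard-2", "shard-3"])
    for tenant in ("acme", "globex", 42):
        print(tenant, router.shard_for(tenant))
```

Modulo placement is the simplest option, but changing the shard count remaps most tenants, so production systems often use a lookup table or consistent hashing instead.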

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so keep on going)...

Categories: Architecture

Paper: Can Programming Be Liberated From The Von Neumann Style?

Thu, 05/01/2014 - 16:56

Famous computer scientist John Backus, the B in BNF (Backus-Naur form) and the creator of Fortran, gave a Turing Award Lecture titled Can programming be liberated from the von Neumann style?: a functional style and its algebra of programs, which laid out a division in programming that lives on long after it was published in 1977. 

It's the now familiar argument for why functional programming is superior:

The assignment statement is the von Neumann bottleneck of programming languages and keeps us thinking in word-at-a-time terms in much the same way the computer's bottleneck does.

...

The second world of conventional programming languages is the world of statements. The primary statement in that world is the assignment statement itself. All the other statements of the language exist in order to make it possible to perform a computation that must be based on this primitive construct: the assignment statement.
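
The contrast in those passages can be sketched with an inner product, the kind of small example this debate tends to revolve around. The first version advances one word at a time through repeated assignment; the second composes whole-sequence operations with no assignment statements. Python stands in for both styles here, and the function names are illustrative only.

```python
from operator import mul


def inner_product_imperative(a, b):
    """Word-at-a-time style: state is updated by repeated assignment."""
    total = 0
    for i in range(len(a)):
        total = total + a[i] * b[i]
    return total


def inner_product_functional(a, b):
    """Functional style: multiply element-wise, then reduce with addition."""
    return sum(map(mul, a, b))


if __name__ == "__main__":
    print(inner_product_imperative([1, 2, 3], [4, 5, 6]))
    print(inner_product_functional([1, 2, 3], [4, 5, 6]))
```

Both functions compute the same result; the disagreement is over which form is easier to reason about and to parallelize.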

Here's a response by Dijkstra: A review of the 1977 Turing Award Lecture by John Backus. And here's an interview with Dijkstra.

Great discussion on a recent Hacker News thread and an older thread. Also on Lambda the Ultimate. Nice summary of the project by David Bolton.

There's nothing I can really add to the discussion, as much smarter people than me have argued this endlessly. Personally, I'm more of a biology than a math programmer, and a "languages are for communicating with people" sort of programmer. So arguments from formal systems have never persuaded me greatly. Doing a proof of a bubble sort in school was quite enough for me. Its applicability to real complex systems has always been in doubt.

The problem of how to best utilize distributed cores is a compelling concern. Though the assumption that parallelism has to be solved at the language level and not how we've done it, at the system level, is not as compelling.

It's a passionate paper and the discussion is equally passionate. While nothing is really solved, if you haven't deep dived into this dialogue across the generations, it's well worth your time to do so. 

Categories: Architecture

10 Tips for Optimizing NGINX and PHP-fpm for High Traffic Sites

Wed, 04/30/2014 - 16:57

Adrian Singer has boiled down 7 years of experience to a set of 10 very useful tips on how to best optimize NGINX and PHP-fpm for high traffic sites:

  1. Switch from TCP to UNIX domain sockets. When communicating with processes on the same machine, UNIX sockets have better performance than TCP because there's less copying and fewer context switches.
  2. Adjust Worker Processes. Set the worker_processes in your nginx.conf file to the number of cores your machine has and increase the number of worker_connections.
  3. Set up upstream load balancing. Multiple upstream backends on the same machine produce higher throughput than a single one.
  4. Disable access log files. Log files on high traffic sites involve a lot of I/O that has to be synchronized across all threads. This can have a big impact.
  5. Enable GZip
  6. Cache information about frequently accessed files
  7. Adjust client timeouts.
  8. Adjust output buffers.
  9. /etc/sysctl.conf tuning.
  10. Monitor. Continually monitor the number of open connections, free memory and number of waiting threads and set alerts if thresholds are breached. Install the NGINX stub_status module.

Please take a look at the original article as it includes excellent configuration file examples.
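As a rough illustration of the monitoring tip, here is a minimal Python sketch that parses the plain-text page produced by the NGINX stub_status module and reports which thresholds are breached. The sample text follows the format shown in the NGINX documentation; the dictionary keys, the threshold values, and the function names are assumptions made for this example, and fetching the page over HTTP is left out.

```python
import re

# Sample stub_status page (format as in the NGINX docs; numbers are illustrative).
SAMPLE_STATUS = """Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
"""


def parse_stub_status(text):
    """Parse a stub_status page into a dict of integer counters."""
    active = re.search(r"Active connections:\s+(\d+)", text)
    totals = re.search(
        r"^[ \t]*(\d+)[ \t]+(\d+)[ \t]+(\d+)[ \t]*$", text, re.MULTILINE
    )
    states = re.search(
        r"Reading:\s+(\d+)\s+Writing:\s+(\d+)\s+Waiting:\s+(\d+)", text
    )
    if not (active and totals and states):
        raise ValueError("unrecognized stub_status output")
    accepts, handled, requests = (int(v) for v in totals.groups())
    reading, writing, waiting = (int(v) for v in states.groups())
    return {
        "active": int(active.group(1)),
        "accepts": accepts,
        "handled": handled,
        "requests": requests,
        "reading": reading,
        "writing": writing,
        "waiting": waiting,
    }


def breached(stats, thresholds):
    """Return the counters that exceed their (hypothetical) alert limits."""
    return {
        name: stats[name]
        for name, limit in thresholds.items()
        if stats[name] > limit
    }


stats = parse_stub_status(SAMPLE_STATUS)
print(breached(stats, {"active": 250, "waiting": 500}))
```

In practice the text would come from whatever URL the status location is exposed on, and the result of `breached` would feed an alerting tool rather than `print`.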

Categories: Architecture

Sponsored Post: Apple, Wargaming.net, PagerDuty, HelloSign, CrowdStrike, Gengo, ScaleOut Software, Couchbase, Tokutek, MongoDB, BlueStripe, AiScaler, Aerospike, LogicMonitor, AppDynamics, ManageEngine, Site24x7

Tue, 04/29/2014 - 16:56

Who's Hiring?
  • Apple is hiring a Senior Engineer in their Mobile Services team. We seek an accomplished server-side engineer capable of delivering an extraordinary portfolio of features and services based on emerging technologies to our internal customers. Please apply here

  • Apple is hiring a Software Engineer in their Messaging Services team. We build the cloud systems that power some of the busiest applications in the world, including iMessage, FaceTime and Apple Push Notifications. You'll have the opportunity to explore a wide range of technologies, developing the server software that is driving the future of messaging and mobile services. Please apply here.

  • Apple is hiring an Enterprise Software Engineer. Apple's Emerging Technology Services group provides a Java based SOA platform for various applications to interact with each other. The platform is designed to handle millions of messages a day with very low latency. We have an immediate opening for a talented Software Engineer in a highly visible team who is passionate about exploring emerging technologies to create elegant scalable solutions. Please apply here

  • Engine Programmer - C/C++. Wargaming|BigWorld is seeking Engine Programmers to join our team in Sydney, Australia. We offer a relocation package, Australian working visa & great salary + bonus. Your primary responsibility will be to work on our PC engine. Please apply here

  • Senior Engineer wanted for large scale, security oriented distributed systems application that offers career growth and independent work environment. Use your talents for good instead of getting people to click ads at CrowdStrike. Please apply here.

  • Ops Engineer - Are you passionate about scaling and automating cloud-based websites? Love Puppet and deployment scripts? Want to take advantage of both your sys-admin and DevOps skills? Join HelloSign as our second Ops Engineer and help us scale as we grow! Apply at http://www.hellosign.com/info/jobs

  • Human Translation Platform Gengo Seeks Sr. DevOps Engineer. Build an infrastructure capable of handling billions of translation jobs, worked on by tens of thousands of qualified translators. If you love playing with Amazon’s AWS, understand the challenges behind release-engineering, and get a kick out of analyzing log data for performance bottlenecks, please apply here.

  • UI Engineer. AppDynamics, founded in 2008 and led by proven innovators, is looking for a passionate UI Engineer to design, architect, and develop their user interface using the latest web and mobile technologies. Make the impossible possible and the hard easy. Apply here.

  • Software Engineer - Infrastructure & Big Data. AppDynamics, leader in next generation solutions for managing modern, distributed, and extremely complex applications residing in both the cloud and the data center, is looking for Software Engineers (All-Levels) to design and develop scalable software written in Java and MySQL for the backend component of software that manages application architectures. Apply here.
Fun and Informative Events
  • The Biggest MongoDB Event Ever Is On. Will You Be There? Join us in New York City June 23-25 for MongoDB World! The conference lineup includes Amazon CTO Werner Vogels and Cloudera Co-Founder Mike Olson for keynote addresses.  You’ll walk away with everything you need to know to build and manage modern applications. Register before April 4 to take advantage of super early bird pricing.

  • Upcoming Webinar: Practical Guide to SQL - NoSQL Migration. Avoid common pitfalls of NoSQL deployment with the best practices in this May 8 webinar with Anton Yazovskiy of Thumbtack Technology. He will review key questions to ask before migration, and differences in data modeling and architectural approaches. Finally, he will walk you through a typical application based on RDBMS and will migrate it to NoSQL step by step. Register for the webinar.
Cool Products and Services
  • PagerDuty helps operations and DevOps engineers resolve problems as quickly as possible. By aggregating errors from all your IT monitoring tools, and allowing easy on-call scheduling that ensures the right alerts reach the right people, PagerDuty increases uptime and reduces on-call burnout—so that you only wake up when you have to. Thousands of companies rely on PagerDuty, including Netflix, Etsy, Heroku, and Github.

  • GigOM Interviews Aerospike at Structure Data 2014 on Application Scalability. Aerospike Technical Marketing Director, Young Paik explains how you can add rocket fuel to your big data application by running the Aerospike database on top of Hadoop for lightning fast user-profile lookups. Watch this interview.

  • Couchbase: NoSQL and the Hybrid Cloud. If a NoSQL database can be deployed on-premise or it can be deployed in the cloud, why can’t it be deployed on-premise and in the cloud? It can, and it should. Read how in this article covering three use cases for hybrid cloud deployments of NoSQL databases: master / slave, cloud burst, and multi-master.

  • Do Continuous MapReduce on Live Data? ScaleOut Software's hServer was built to let you hold your daily business data in-memory, update it as it changes, and concurrently run continuous MapReduce tasks on it to analyze it in real-time. We call this "stateful" analysis. To learn more check out hServer.

  • LogicMonitor is the cloud-based IT performance monitoring solution that enables companies to easily and cost-effectively monitor their entire IT infrastructure stack – storage, servers, networks, applications, virtualization, and websites – from the cloud. No firewall changes needed - start monitoring in only 15 minutes utilizing customized dashboards, trending graphs & alerting.

  • BlueStripe FactFinder Express is the ultimate tool for server monitoring and solving performance problems. Monitor URL response times and see if the problem is the application, a back-end call, a disk, or OS resources.

  • aiScaler, aiProtect, aiMobile Application Delivery Controller with integrated Dynamic Site Acceleration, Denial of Service Protection and Mobile Content Management. Cloud deployable. Free instant trial, no sign-up required.  http://aiscaler.com/

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network.

If any of these items interest you there's a full description of each sponsor below. Please click to read more...

Categories: Architecture

How Disqus Went Realtime with 165K Messages Per Second and Less than .2 Seconds Latency

Mon, 04/28/2014 - 17:21

How do you add realtime functionality to a web scale application? That's what Adam Hitchcock, a Software Engineer at Disqus, talks about in an excellent talk: Making DISQUS Realtime (slides).

Disqus had to take their commenting system and add realtime capabilities to it. Not something that's easy to do when, at the time of the talk (2013), they had just hit a billion unique visitors a month.

What Disqus developed is a realtime commenting system called “realertime” that was tested to handle 1.5 million concurrently connected users, 45,000 new connections per second, 165,000 messages/second, with less than .2 seconds latency end-to-end.

The nature of a commenting system is that it is IO bound and has a high fanout, that is, a comment comes in and must be sent out to a lot of readers. It's a problem very similar to what Twitter must solve.

Disqus' solution was quite interesting, as was the path to their solution. They tried different architectures but settled on a solution built on Python, Django, Nginx Push Stream Module, and Thoonk, all unified by a flexible pipeline architecture. In the process they were able to substantially reduce their server count and easily handle high traffic loads.

At one point in the talk Adam asks if a pipelined architecture is a good one. For Disqus, messages filtering through a series of transforms is a perfect match. And it's a very old idea. Unix System V has long had a Streams capability for creating flexible pipeline architectures. It's an incredibly flexible and powerful way of organizing code.
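The idea of messages filtering through a series of transforms and then fanning out to many readers can be sketched in a few lines of Python. This is not Disqus' code: the stage names, the message format, and the in-memory "inboxes" are all hypothetical, chosen only to show the shape of a pipeline built from small, swappable stages.

```python
def parse(raw_messages):
    # Stage 1: split "thread: body" strings into dicts.
    for raw in raw_messages:
        thread, _, body = raw.partition(":")
        yield {"thread": thread, "body": body.strip()}


def drop_empty(messages):
    # Stage 2: filter out comments with no content.
    for msg in messages:
        if msg["body"]:
            yield msg


def format_for_clients(messages):
    # Stage 3: render each message into the form sent to readers.
    for msg in messages:
        yield f'[{msg["thread"]}] {msg["body"]}'


def pipeline(source, *stages):
    # Chain the stages lazily; each consumes the previous stage's output.
    stream = source
    for stage in stages:
        stream = stage(stream)
    return stream


def fan_out(messages, subscribers):
    # High fanout: every message is delivered to every subscriber.
    for msg in messages:
        for inbox in subscribers:
            inbox.append(msg)


incoming = ["t1: first!", "t1:   ", "t2: nice post"]
inboxes = [[], []]
fan_out(pipeline(incoming, parse, drop_empty, format_for_clients), inboxes)
print(inboxes[0])
```

Because each stage only sees an iterator, stages can be added, removed, or reordered without touching the others, which is the flexibility the pipeline approach is praised for.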

So let's see how Disqus evolved their realtime commenting architecture and created something both old and new in the process...

Categories: Architecture

Stuff The Internet Says On Scalability For April 25th, 2014

Fri, 04/25/2014 - 16:56

Hey, it's HighScalability time:


New World Record BASE jumping from World's Tallest Building. #crazy
  • 30 billion: total Pinterest pins; 500,000,000: WhatsApp users (700 million photos and 100 million videos every single day); 1 billion: Facebook active users on phones and tablets.
  • Quotable Quotes:
    • @jimplush: Google spent 2.3 billion on infrastructure in Q1. Remember that when you say you want to be "the Google of something"
    • Clay Shirky: I think one of the things that happened to the P2P market for infrastructure is that users' preference for predictable pricing vs resource-sensitive pricing is so overwhelming that they will overpay to anyone who can promise flat prices. And because the logic of centralization vs decentralization is so price sensitive, I don't think there is any logical reason to assume a broadly stable class of apps, separate from current pricing data for energy, cycles, and storage.
    • @chipchilders: Stop freaking making new projects just for the sake of loose coupling. Damn it people.
    • Benedict Evans (paraphrased): A startup 15 years ago raised 10 million dollars, had 100 people, and a million users. Now you raise a million dollars, have 10 people, and a 100 million users.
    • @francesc: "Go was created for the cloud infrastructure, when we used to call it servers" - @rob_pike at #gophercon
    • @postwait: distributed system: an arbitrarily large state machine w/ "unknown" & "f*cked" states wherein you can't observe the movement between states.
    • @enneff: "In Ruby regular expressions are actually very fast... compared to all the other things you can do." --@derekcollison LOL #gophercon
    • Steve Jobs: This needs to be like magic. Go back, this isn’t magical enough!
    • @jamesurquhart: The complexity isn’t in the tech, it is in the interconnected apps and comps in systems. Managing interconnectedness is managing complexity.
    • Alex Pentland: Put another way, social physics is about how human behavior is driven by the exchange of ideas—how people cooperate to discover, select, and learn strategies and coordinate their actions—rather than how markets are driven by the exchange of money.

  • Steve Jobs with the carrot: This is important, this needs to happen, and you do it. And now this stick: Guess what, you’re Margaret from now on. 

  • A fantastic look at Uplink Latency of WiFi and 4G Networks by Ilya Grigorik: WiFi can deliver low latency first hop if the network is mostly idle. By contrast, 4G networks require coordination between the device and the radio tower for each uplink transfer. First off, latency aside, and regardless of wireless technology, consider the energy costs of your network transfers! Periodic transfers incur high energy overhead due to the need to wake up the radio on each transmission. Second, same periodic transfers also incur high uplink coordination overhead - 4G in particular. In short, don't trickle data. Aggregate your network requests and fire them in one batch: you will reduce energy costs and reduce latency by amortizing scheduling overhead.

  • It's like Sherlock for programmers. How to detect bank loan fraud with graphs : part 2. A fun way to use graph databases, finding criminals by analyzing patterns using graph algorithms. Also, Building a Graph-based Analytics Platform: Part I

  • Cache Invalidation Strategies With Varnish Cache. Good explanation of different techniques: purging, bans, tagging, grace, TTL. It covers the issue of distributing cache invalidations to multiple caches, but it doesn't seem fault tolerant. 

  • To be smarter the brain had to get more social.  A general pattern for intelligence at different scales? Finding turns neuroanatomy on its head: Researchers present new view of myelin: The fact that it is the most evolved neurons, the ones that have expanded dramatically in humans, suggests that what we're seeing might be the "future." As neuronal diversity increases and the brain needs to process more and more complex information, neurons change the way they use myelin to "achieve" more.  It is possible that these profiles of myelination may be giving neurons an opportunity to branch out and 'talk' to neighboring neurons. These long myelin gaps may be needed to increase neuronal communication and synchronize responses across different neurons. < For more on the amazing ways human social networks improve problem solving take a look at Social Physics: How Good Ideas Spread-The Lessons from a New Science.
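The "don't trickle data" advice from the uplink latency item above can be sketched as a small batching buffer. This is a minimal illustration under stated assumptions: the class name, the size and age limits, and the `send_batch` callback are hypothetical, and the age limit is only checked when a new item arrives (a real client would also flush from a timer).

```python
import time


class BatchingSender:
    """Collect small payloads and hand them to send_batch in one go."""

    def __init__(self, send_batch, max_items=20, max_wait_s=30.0,
                 clock=time.monotonic):
        self._send_batch = send_batch    # called with a list of items
        self._max_items = max_items      # flush when this many are pending
        self._max_wait_s = max_wait_s    # or when the oldest is this old
        self._clock = clock
        self._pending = []
        self._first_at = None

    def submit(self, item):
        now = self._clock()
        if not self._pending:
            self._first_at = now
        self._pending.append(item)
        too_many = len(self._pending) >= self._max_items
        too_old = now - self._first_at >= self._max_wait_s
        if too_many or too_old:
            self.flush()

    def flush(self):
        # One radio wake-up and one uplink grant for the whole batch.
        if self._pending:
            batch, self._pending = self._pending, []
            self._send_batch(batch)


sent = []
sender = BatchingSender(sent.append, max_items=3, max_wait_s=1000.0)
for event in range(7):
    sender.submit(event)
sender.flush()  # push out whatever is left, e.g. when the app backgrounds
print(sent)
```

Seven submissions result in three transmissions instead of seven, which is the amortization of energy and scheduling overhead the item describes.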

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so keep on going)...

Categories: Architecture

Here's a 1300 Year Old Solution to Resilience - Rebuild, Rebuild, Rebuild

Wed, 04/23/2014 - 16:56

How is it possible that a wooden Shinto shrine built in the 7th century is still standing? The answer depends on how you answer this philosophical head scratcher: With nearly every cell in your body continually being replaced, are you still the same person?

The Ise Grand Shrine has been in continuous existence for over 1300 years because every twenty years an exact replica has been rebuilt on an adjacent footprint. The former temple is then dismantled.

Now that's resilience. If you want something to last make it a living part of a culture. It's not so much the building that is remade, what is rebuilt and passed down from generation to generation is the meme that the shrine is important and worth preserving. The rest is an unfolding of that imperative.

You can see echoes of this same process in Open Source projects like Linux and the libraries and frameworks that get themselves reconstructed in each new environment.

The patterns of recurrence in software are the result of a Darwinian selection process that keeps simplicity and value alive in human minds.

A blog post on Pursuing Wabi has some fabulous photos of the shrine along with a brief description of why it's the way it is:

Categories: Architecture

This is why Microsoft won. And why they lost.

Mon, 04/21/2014 - 16:56

My favorite histories are those told from an insider's perspective. The story of Richard the Lionheart is full of great battles and dynastic intrigue. The story of one of his soldiers, not so much. Yet the soldier's story, as that of someone who has experienced the real consequences of decisions made and actions taken, is more revealing.

We get such a history in Chat Wars, a wonderful article written by David Auerbach, who in 1998 worked at Microsoft on MSN Messenger Service, Microsoft’s instant messaging app (for a related story see The Rise and Fall of AIM, the Breakthrough AOL Never Wanted).

It's as if Herodotus visited Microsoft and wrote down his experiences. It has that same sort of conversational tone, insightful on-the-ground observations, and facts no outsider might ever believe.

Much of the article is a play-by-play account of the cat and mouse game David plays changing Messenger to track AOL's Instant Messenger protocol changes. AOL repeatedly tried to make it so Messenger could not interoperate with AIM and each time Messenger countered with changes of their own. AOL finally won the game with a radical and unexpected play. A great read for programmers. 

For a general audience David's explanation of how and why Microsoft came to dominance and why they lost that dominance is most revealing. It stares directly into the heart of the entropy that brings everything down in the end.

Why Microsoft Won 
Categories: Architecture