Skip to content

Software Development Blogs: Programming, Software Testing, Agile Project Management

Methods & Tools

Subscribe to Methods & Tools
if you are not afraid to read more than one page to be a smarter software developer, software tester or project manager!

High Scalability - Building bigger, faster, more reliable websites
Syndicate content
Updated: 8 hours 37 min ago

Paper: Staring into the Abyss: An Evaluation of Concurrency Control with One Thousand Cores

Thu, 03/12/2015 - 16:56

How will your OLTP database perform if it had to scale up to 1024 cores? Not very well according to this fascinating paper: Staring into the Abyss: An Evaluation of Concurrency Control with One Thousand Cores, where a few intrepid chaos monkeys report the results of their fiendish experiment. The conclusion: we need a completely redesigned DBMS architecture that is rebuilt from the ground up.

Summary:

Categories: Architecture

Cassandra Migration to EC2

Wed, 03/11/2015 - 16:57

This is a guest post by Tommaso Barbugli the CTO of getstream.io, a web service for building scalable newsfeeds and activity streams.

In January we migrated our entire infrastructure from dedicated servers in Germany to EC2 in the US. The migration included a wide variety of components, web workers, background task workers, RabbitMQ, Postgresql, Redis, Memcached and our Cassandra cluster. Our main requirement was to execute this migration without downtime.

This article covers the migration of our Cassandra cluster. If you’ve never run a Cassandra migration before, you’ll be surprised to see how easy this is. We were able to migrate Cassandra with zero downtime using its awesome multi-data center support. Cassandra allows you to distribute your data in such a way that a complete set of data is guaranteed to be placed on every logical group of nodes (eg. nodes that are on the same data-center, rack, or EC2 regions...). This feature is a perfect fit for migrating data from one data-center to another. Let’s start by introducing the basics of a Cassandra multi-datacenter deployment.

Cassandra, Snitches and Replication strategies
Categories: Architecture

AppLovin: Marketing to Mobile Consumers Worldwide by Processing 30 Billion Requests a Day

Mon, 03/09/2015 - 16:56

This is a guest post from AppLovin's VP of engineering, Basil Shikin, on the infrastructure of its mobile marketing platform. Major brands like Uber, Disney, Yelp and Hotels.com use AppLovin's mobile marketing platform. It processes 30 billion requests a day and 60 terabytes of data a day.

AppLovin's marketing platform provides marketing automation and analytics for brands who want to reach their consumers on mobile. The platform enables brands to use real-time data signals to make effective marketing decisions across one billion mobile consumers worldwide.

Core Stats
  • 30 Billion ad requests per day

  • 300,000 ad requests per second, peaking at 500,000 ad requests per second

  • 5ms average response latency

  • 3 Million events per second

  • 60TB of data processed daily

  • ~1000 servers

  • 9 data centers

  • ~40 reporting dimensions

  • 500,000 metrics data points per minute

  • 1 Pb Spark cluster

  • 15GB/s peak disk writes across all servers

  • 9GB/s peak disk reads across all servers

  • Founded in 2012, AppLovin is headquartered in Palo Alto, with offices in San Francisco, New York, London and Berlin.

 

Technology Stack
Categories: Architecture

The Architecture of Algolia’s Distributed Search Network

Mon, 03/09/2015 - 16:56

Guest post by Julien Lemoine, co-founder & CTO of Algolia, a developer friendly search as a service API.

Algolia started in 2012 as an offline search engine SDK for mobile. At this time we had no idea that within two years we would have built a worldwide distributed search network.

Today Algolia serves more than 2 billion user generated queries per month from 12 regions worldwide, our average server response time is 6.7ms and 90% of queries are answered in less than 15ms. Our unavailability rate on search is below 10-6 which represents less than 3 seconds per month.

The challenges we faced with the offline mobile SDK were technical limitations imposed by the nature of mobile. These challenges forced us to think differently when developing our algorithms because classic server-side approaches would not work.

Our product has evolved greatly since then. We would like to share our experiences with building and scaling our REST API built on top of those algorithms.

We will explain how we are using a distributed consensus for high-availability and synchronization of data in different regions around the world and how we are doing the routing of queries to the closest locations via an anycast DNS.

The data size misconception
Categories: Architecture

Stuff The Internet Says On Scalability For March 6th, 2015

Fri, 03/06/2015 - 17:56

Hey, it's HighScalability time:


The future of technology in one simple graph (via swardley)
  • $50 billion: the worth of AWS (is it low?); 21 petabytes: size of the Internet Archive; 41 million: # of views of posts about a certain dress

  • Quotable Quotes:
    • @bpoetz: programming is awesome if you like feeling dumb and then eventually feeling less dumb but then feeling dumb about something else pretty soon
    • @Steve_Yegge: Saying you don't need debuggers because you have unit tests is like saying you don't need detectives because you have jails.
    • Nasser Manesh: “Some of the things we ran into are issues [with Docker] that developers won’t see on laptops. They only show up on real servers with real BIOSes, 12 disk drives, three NICs, etc., and then they start showing up in a real way. It took quite some time to find and work around these issues.”
    • Guerrilla Mantras Online: Best practices are an admission of failure.
    • Keine Kommentare: Using master-master for MySQL? To be frankly we need to get rid of that architecture. We are skipping the active-active setup and show why master-master even for failover reasons is the wrong decision.
    • Ed Felten: the NSA’s actions in the ‘90s to weaken exportable cryptography boomeranged on the agency, undermining the security of its own site twenty years later.
    • @trisha_gee: "Java is both viable and profitable for low latency" @giltene at #qconlondon
    • @michaelklishin: Saw Eric Brewer trending worldwide. Thought the CAP theorem finally went mainstream. Apparently some Canadian hockey player got traded.
    • @ThomasFrey: Mobile game revenues will grow 16.5% in 2015, to more than $3B
    • John Allspaw: #NoEstimates is an example of something that engineers seem to do a lot, communicating a concept by saying what it’s not.

  • Improved thread handling contention, NDB API receive processing, scans and PK lookups in the data nodes has lead to a monster 200M reads per second in MySQL Cluster 7.4. That's on an impressive hardware configuration to be sure, but it doesn't matter how mighty the hardware if your software can't drive it.

  • Have you heard of this before? LinkedIn shows how to use SDCH, an HTTP/1.1-compatible extension, which reduces the required bandwidth through the use of a dictionary shared between the client and the server, to achieve impressive results: When sdch and gzip are combined, we have seen additional compression as high as 81% on certain files when compared to gzip only. Average additional compression across our static content was about 24%. 

  • Double awesome description of How we [StackExchange] upgrade a live data center. It was an intricate multi-day highly choreographed dance. Toes were stepped on, but there was also great artistry. The result of the new beefier hardware: The decrease on question render times (from approx30-35ms to 10-15ms) is only part of the fun. Great comment thread on reddit.

  • A jaunty exploration of Microservices: What are They and Why Should You Care?  Indix is following Twitter and Netflix by tossing their monolith for microservices. The main idea: Microservices decouples your systems and gives more options and choices to evolve them independently.

  • Wired's new stack: WordPress, PHP, Stylus for CSS, jQuery, starting with React.js, JSON, Vagrant, Gulp for task automation, Git hooks, Lining, GitHub, Jenkins.

  • Have you ever wanted to search Hacker News? Algolia has created a great search engine for HN.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

10 Reasons to Consider a Multi-Model Database

Wed, 03/04/2015 - 17:56

This is a guest post by Nikhil Palekar, Solutions Architect, FoundationDB.

The proliferation of NoSQL databases is a response to the needs of modern applications. Still, not all data can be shoehorned into a particular NoSQL model, which is why so many different database options exist in the market. As a result, organizations are now facing serious database bloat within their infrastructure.

But a new class of database engine recently has emerged that can address the business needs of each of those applications and use cases without also requiring the enterprise to maintain separate systems, software licenses, developers, and administrators.

These multi-model databases can provide a single back end that exposes multiple data models to the applications it supports. In that way, multi-model databases eliminate fragmentation and provide a consistent, well-understood backend that supports many different products and applications. The benefits to the organization are extensive, but some of the most significant benefits include:

1. Consolidation
Categories: Architecture

Sponsored Post: Apple, InMemory.Net, Sentient, Couchbase, VividCortex, Internap, Transversal, MemSQL, Scalyr, FoundationDB, AiScaler, Aerospike, AppDynamics, ManageEngine, Site24x7

Tue, 03/03/2015 - 17:56

Who's Hiring?
  • Apple is hiring a Software Engineer for Maps Services. The Maps Team is looking for a developer to support and grow some of the core backend services that support Apple Map's Front End Services. Ideal candidate would have experience with system architecture, as well as the design, implementation, and testing of individual components but also be comfortable with multiple scripting languages. Please apply here.

  • Sentient Technologies is hiring several Senior Distributed Systems Engineers and a Senior Distributed Systems QA Engineer. Sentient Technologies, is a privately held company seeking to solve the world’s most complex problems through massively scaled artificial intelligence running on one of the largest distributed compute resources in the world. Help us expand our existing million+ distributed cores to many, many more. Please apply here.

  • Linux Web Server Systems EngineerTransversal. We are seeking an experienced and motivated Linux System Engineer to join our Engineering team. This new role is to design, test, install, and provide ongoing daily support of our information technology systems infrastructure. As an experienced Engineer you will have comprehensive capabilities for understanding hardware/software configurations that comprise system, security, and library management, backup/recovery, operating computer systems in different operating environments, sizing, performance tuning, hardware/software troubleshooting and resource allocation. Apply here.

  • UI EngineerAppDynamics, founded in 2008 and lead by proven innovators, is looking for a passionate UI Engineer to design, architect, and develop our their user interface using the latest web and mobile technologies. Make the impossible possible and the hard easy. Apply here.

  • Software Engineer - Infrastructure & Big DataAppDynamics, leader in next generation solutions for managing modern, distributed, and extremely complex applications residing in both the cloud and the data center, is looking for a Software Engineers (All-Levels) to design and develop scalable software written in Java and MySQL for backend component of software that manages application architectures. Apply here.
Fun and Informative Events
  • Rise of the Multi-Model Database. FoundationDB Webinar: March 10th at 1pm EST. Do you want a SQL, JSON, Graph, Time Series, or Key Value database? Or maybe it’s all of them? Not all NoSQL Databases are not created equal. The latest development in this space is the Multi Model Database. Please join FoundationDB for an interactive webinar as we discuss the Rise of the Multi Model Database and what to consider when choosing the right tool for the job.
Cool Products and Services
  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides a native .Net, COM & ODBC apis for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • Top Enterprise Use Cases for NoSQL. Discover how the largest enterprises in the world are leveraging NoSQL in mission-critical applications with real-world success stories. Get the Guide.
    http://info.couchbase.com/HS_SO_Top_10_Enterprise_NoSQL_Use_Cases.html

  • VividCortex Developer edition delivers a groundbreaking performance management solution to startups, open-source projects, nonprofits, and other organizations free of charge. It integrates high-resolution metrics on queries, metrics, processes, databases, and the OS and hardware to deliver an unprecedented level of visibility into production database activity.

  • SQL for Big Data: Price-performance Advantages of Bare Metal. When building your big data infrastructure, price-performance is a critical factor to evaluate. Data-intensive workloads with the capacity to rapidly scale to hundreds of servers can escalate costs beyond your expectations. The inevitable growth of the Internet of Things (IoT) and fast big data will only lead to larger datasets, and a high-performance infrastructure and database platform will be essential to extracting business value while keeping costs under control. Read more.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • Aerospike demonstrates RAM-like performance with Google Compute Engine Local SSDs. After scaling to 1 M Writes/Second with 6x fewer servers than Cassandra on Google Compute Engine, we certified Google’s new Local SSDs using the Aerospike Certification Tool for SSDs (ACT) and found RAM-like performance and 15x storage cost savings. Read more.

  • Diagnose server issues from a single tab. The Scalyr log management tool replaces all your monitoring and analysis services with one, so you can pinpoint and resolve issues without juggling multiple tools and tabs. It's a universal tool for visibility into your production systems. Log aggregation, server metrics, monitoring, alerting, dashboards, and more. Not just “hosted grep” or “hosted graphs,” but enterprise-grade functionality with sane pricing and insane performance. Trusted by in-the-know companies like Codecademy – try it free! (See how Scalyr is different if you're looking for a Splunk alternative.)

  • aiScaler, aiProtect, aiMobile Application Delivery Controller with integrated Dynamic Site Acceleration, Denial of Service Protection and Mobile Content Management. Also available on Amazon Web Services. Free instant trial, 2 hours of FREE deployment support, no sign-up required. http://aiscaler.com

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network.

If any of these items interest you there's a full description of each sponsor below. Please click to read more...

Categories: Architecture

Stuff The Internet Says On Scalability For February 27th, 2015

Fri, 02/27/2015 - 17:56

Hey, it's HighScalability time:


Hear ye puny mortal. 1.3 million Earths doth fill our Sun. Whence comes this monster black hole with a mass 12 billion times that of the Sun?

 

  • 1 Terabit of Data per Second: 5G; 1.9 Terabytes:  customer in stadium data usage during the Super Bowl; 1 TB: free each month on Big Query; 100x: reduced power consumption in radio chip

  • Quotable Quotes:
    • Robin Harris: But now that non-volatile memory technology - flash today, plus RRAM tomorrow - has been widely accepted, it is time to build systems that use flash directly instead of through our antique storage stacks. 
    • Sundar Pichai: That’s the essence of what we [Google] get excited about – working on problems for people at scale, which make a big difference in [people’s] lives. 
    • @timoreilly: Facebook is hacked 600,000 times a day. @futurecrimes First thing to do to protect yourself, turn on 2-factor authentication
    • @architectclippy: I see you have a poorly structured monolith. Would you like me to convert it into a poorly structured set of microservices?
    • Poppy Crum: Your brain wants as much as possible to come up with a robust actionable perception of the world and of the information and data that is coming in.
    • @BenedictEvans: Both Google and Facebook killing XMPP.  IM being euthanized just at the time messaging could become a third run-time for the internet
    • @dhh: 4-tier / micro-service architectures are organizational scaling patterns far more than they're tech. 1st rule of distributed systems: Don't.
    • @amcafee: Ex-Etsy seller: "In practical terms, scaling the handmade economy is an impossibility."
    • kurin: If you're behind a LB you can just drain the traffic to the hosts you're about to upgrade. Also, if you're above your SLA... I mean, some dropped queries aren't the end of the world.
    • @WSJ: Facebook’s 5,000+ staff generate $1.36 million each in annual revenue. The key to productivity is custom-built software tools
    • @jaykreps: Software is mostly human capital (in people's heads): losing the team is usually worse than losing the code.
    • Dylan Tweney: Mobile growth is huge, and could surge at least 3x in the next two years
    • Joe Davison: I learned that there is often more to business than meets the eye, and the only way to succeed is to plan ahead and anticipate all contingencies.
    • @etherealmind: Google published 30000 configuration changes to its network in 1 month 

  • What's different about AI this time around? Less hype, more data, more computation. The Believers: It was a stunning result. These neural nets were little different from what existed in the 1980s. This was simple supervised learning. It didn’t even require Hinton’s 2006 breakthrough. It just turned out that no other algorithm scaled up like these nets. "Retrospectively, it was a just a question of the amount of data and the amount of computations," Hinton says.

  • What lesson did Ozgun Erdogan learn while working on a database at Amazon that never saw the light of day? How to Build Your Distributed Database (1/2): This optimized plan has many computations pushed down in the query tree, and only collects a small amount of data. This enables scalability. Much more importantly, this logical plan formalizes how relational algebra operators scale in distributed systems, and why. That's one key takeaway I had from building a distributed database before. In the land of distributed systems, commutativity is king. Model your queries with respect to the king, and they will scale.

  • Replication for resiliency? Nature thought of that. Nibbled? No Problem: Champaign first observed in the 1980s, some plants respond by making more seeds, ultimately benefiting from injury in a phenomenon called overcompensation. More recently, Paige and postdoc Daniel Scholes suspected a role for endoreduplication, in which a cell makes extra copies of its genome without dividing, multiplying its number of chromosome sets, or “ploidy.”

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Deep Learning without Deep Pockets

Wed, 02/25/2015 - 17:56

Now that you’ve transformed your system through successive evolutions of architecture goodness...you've made it cloud native, you now treat a fist full of datacenters as a single computer, you’ve microservicized it, you’ve containerized it, you’re continuously releasing and improving it, you’ve made it reactive, you’ve socialized it, you’ve mobilized it, you’ve Hadoop’ed it, you’ve made it DevOps friendly, and you have real-time dashboards that would make NORAD jealous...what’s next?

Deep learning is what’s next. Making machines that learn. The problem is how?

All the other transformations have been changes good programmers can learn to do. Deep learning is still deep magic. We are waiting for the Hadoop of deep learning to be built.

Until then, if you aren’t Google with Google sized clusters and cloisters of PhDs, what can you do? Greg Corrado, Senior Research Scientist at Google, gave a great presentation at the RE.WORK Deep Learning Summit 2015 (videos) that has some useful suggestions:

Categories: Architecture

HappyPancake: a Retrospective on Building a Simple and Scalable Foundation

Mon, 02/23/2015 - 17:56

This is a guest repost by Rinat Abdullin, who worked on HappyPancake, the largest free dating site in Sweden. Initially written in ASP.NET and MS SQL Database server, it eventually became overly complex and expensive to scale. This is the last post in a nearly two year long series of engaging articles on the evolution of the project. For the complete list please see the end of this article.

Our project at HappyPancake completed this week. We delivered a simple and scalable foundation for the next version of largest free dating web site in Sweden (with presence in Norway and Finland).

Journey

Below is the short map of that journey. It lists technologies and approaches that we evaluated for this project. Yellow regions highlight items which made their way into the final design.

Project Deliverables
Categories: Architecture

Stuff The Internet Says On Scalability For February 20th, 2015

Fri, 02/20/2015 - 17:56

Hey, it's HighScalability time:


Networks are everywhere, they can even help reveal disease connections.
  • trillions: number of photons constantly hitting your eyes; $19 billion: Snapchat valuation;  8.5K: average number of questions asked on Stack Overflow per day
  • Quotable Quotes:
    • @BenedictEvans: End of 2014: 3.75-4bn mobiles ~1.5bn PCs  7-800m consumer PCs 1.2-1.3bn closed Android 4-500m open Android 650-675m iOS 80m Macs, ~75m Linux
    • @JeremiahLee: “Humans only use 10% of their internet.” —@nvcexploder #NodeSummit
    • beguiledfoil: Javavu - The feeling experienced when you see new languages make the same mistakes Java made 20 years ago and momentarily mistake said language for Java.
    • @ewolff: If Conway's Law is so important - are #Microservices more an organizational approach than an architecture?
    • @KentLangley: "Apache Spark Continues to Spread Beyond Hadoop." I would say supplant. 
    • Database Soup: An in-memory database is one which lacks the capability of spilling to disk.
    • Matthew Dillon: 1-2 year SSD wear on build boxes has been minimal.
    • @gwenshap: Except there is one writer and many readers - so schema and validation must be done on ingest. Anywhere else is just shifting responsibility
    • @jaykreps: Startup style of engineering (fail fast & iterate) doesn't work for every domain, esp. databases & financial systems
    • Taulant Ramabaja: Decentralization is not a goal in and of itself, it is a strategy
    • Eli Reisman: Etsy runs more Hadoop jobs by 7am than most companies do all day.
    • Dormando: We're [memcached] not sponsored like redis is. I've only ever lost money on this venture.
    • The Trust Engineers: There are more Facebook users than Catholics.

  • Exponent...The new integration is hardware + software + services. Not services like disk storage, but services  like HomeKit, HealthKit, Siri, Car Play, Apple Pay. Services that touch every part of our lives. Apple doesn't build cars, stores, or information services, it wraps them with an Apple layer that provides the customer with an integrated experience while taking full advantage of modularity. Modularity wrapped with integration. Owning the hardware is a better profit model than sercvices in the cloud.

  • Quite a response to You Don't Like Google's Go Because You Are Small on reddit. A vigorous 500+ comments were written. Golang isn't perfect. How disappointing, so many things are.

  • After making Linux work well on multiple cores that next bump in performance comes from Improving Linux networking performance. It's a hard job. For a 100Gb adapter on 3GHz CPU there are only about 200 CPU cycles to process each packet. Good break down of time budgets for for various instructions. The approach is improved batching at multiple layers of the stack and better memory management, which leads directly into Toward a more efficient slab allocator.

  • The process behind creating a Google Doodle for Alessandro Volta’s 270th Birthday reminds me a lot of the process of making old style illustrations as described in Cartographies of Time: A History of the Timeline. The idea is to encode symbolically as much of the important information as possible in a single diagram. The coded icon of a tiny skull could mean, for example, a king died while on the throne. A single flame could stand for the fall of man. This art is not completely lost with today's need to convey a lot of information on small screens. This sort of compression has advantages: Strass believed that a graphic representation of history held manifold advantages over a textual one: it revealed order, scale, and synchronism simply and without the trouble of memorization and calculation.
Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...
Categories: Architecture

Free Book: Is Parallel Programming Hard, And, If So, What Can You Do About It?

Wed, 02/18/2015 - 17:56
The trouble ain’t that people are ignorant: it’s that they know so much that ain’t so." -- Josh Billings

 

Is Parallel Programming Hard? Yes. What Can You Do About It? To answer that, Paul McKenney, Distinguished Engineer at IBM Linux Technology Center, vetran of parallel powerhouses SRI and Sequent, has written an epic 400+ page book: Is Parallel Programming Hard, And, If So, What Can You Do About It? 

The goal of the book? "To help you understand how to design shared-memory parallel programs to perform and scale well with minimal risk to your sanity."

So it's not a book about parallelism in the sense of getting the most out of a distributed system, it's a book in the mechanical-sympathy sense of getting the most out of a single machine.

Some example section titles: Introduction, Alternative to Parallel Programming, What Makes Parallel Programming Hard, Hardware and its Habits, Tools of the Trade, Counting, Partitioning and Synchronized Design, Locking, Data Ownership, Deferred Processing, Data Structures, Validation, Formal Verification, Putting it All Together, Advanced Synchronization, Parallel Real-Time Computing, Ease of Use, Conflicting Visions of the Future.

To get a feel for the kind of things you'll learn in the book, here's an interview where Paul talks about what in parallel programming is the hardest to master:

Categories: Architecture

Sponsored Post: Apple, Sentient, Couchbase, Farmerswife, VividCortex, Internap, SocialRadar, Campanja, Transversal, MemSQL, Scalyr, FoundationDB, AiScaler, Aerospike, AppDynamics, ManageEngine, Site24x7

Tue, 02/17/2015 - 17:56

Who's Hiring?
  • Apple is hiring a Application Security Engineer. Apple’s Gift Card Engineering group is looking for a software engineer passionate about application security for web applications and REST services. Be part of a team working on challenging and fast paced projects supporting Apple's business by delivering high volume, high performance, and high availability distributed transaction processing systems. Please apply here.

  • Apple is hiring a Software Engineer for Maps Services. The Maps Team is looking for a developer to support and grow some of the core backend services that support Apple Map's Front End Services. Ideal candidate would have experience with system architecture, as well as the design, implementation, and testing of individual components but also be comfortable with multiple scripting languages. Please apply here.

  • Sentient Technologies is hiring several Senior Distributed Systems Engineers and a Senior Distributed Systems QA Engineer. Sentient Technologies, is a privately held company seeking to solve the world’s most complex problems through massively scaled artificial intelligence running on one of the largest distributed compute resources in the world. Help us expand our existing million+ distributed cores to many, many more. Please apply here.

  • Want to be the leader and manager of a cutting-edge cloud deployment? Take charge of an innovative 24x7 web service infrastructure on the AWS Cloud? Join farmerswife on the beautiful island of Mallorca and help create the next generation on project management tools. Please apply here.

  • Senior DevOps EngineerSocialRadar. We are a VC funded startup based in Washington, D.C. operated like our West Coast brethren. We specialize in location-based technology. Since we are rapidly consuming large amounts of location data and monitoring all social networks for location events, we have systems that consume vast amounts of data that need to scale. As our Senior DevOps Engineer you’ll take ownership over that infrastructure and, with your expertise, help us grow and scale both our systems and our team as our adoption continues its rapid growth. Full description and application here.

  • Linux Web Server Systems EngineerTransversal. We are seeking an experienced and motivated Linux System Engineer to join our Engineering team. This new role is to design, test, install, and provide ongoing daily support of our information technology systems infrastructure. As an experienced Engineer you will have comprehensive capabilities for understanding hardware/software configurations that comprise system, security, and library management, backup/recovery, operating computer systems in different operating environments, sizing, performance tuning, hardware/software troubleshooting and resource allocation. Apply here.

  • Campanja is an Internet advertising optimization company born in the cloud and today we are one of the nordics bigger AWS consumers, the time has come for us to the embrace the next generation of cloud infrastructure. We believe in immutable infrastructure, container technology and micro services, we hope to use PaaS when we can get away with it but consume at the IaaS layer when we have to. Please apply here.

  • UI EngineerAppDynamics, founded in 2008 and lead by proven innovators, is looking for a passionate UI Engineer to design, architect, and develop our their user interface using the latest web and mobile technologies. Make the impossible possible and the hard easy. Apply here.

  • Software Engineer - Infrastructure & Big DataAppDynamics, leader in next generation solutions for managing modern, distributed, and extremely complex applications residing in both the cloud and the data center, is looking for a Software Engineers (All-Levels) to design and develop scalable software written in Java and MySQL for backend component of software that manages application architectures. Apply here.
Fun and Informative Events
  • Rise of the Multi-Model Database. FoundationDB Webinar: March 10th at 1pm EST. Do you want a SQL, JSON, Graph, Time Series, or Key Value database? Or maybe it’s all of them? Not all NoSQL Databases are not created equal. The latest development in this space is the Multi Model Database. Please join FoundationDB for an interactive webinar as we discuss the Rise of the Multi Model Database and what to consider when choosing the right tool for the job.

  • Sign Up for New Aerospike Training Courses.  Aerospike now offers two certified training courses; Aerospike for Developers and Aerospike for Administrators & Operators, to help you get the most out of your deployment.  Find a training course near you. http://www.aerospike.com/aerospike-training/
Cool Products and Services
  • See how LinkedIn uses Couchbase to help power its “Follow” service for 300M+ global users, 24x7. http://info.couchbase.com/14Q4-MKTG-Website-LinkedIn-Scale-LP.html

  • VividCortex Developer edition delivers a groundbreaking performance management solution to startups, open-source projects, nonprofits, and other organizations free of charge. It integrates high-resolution metrics on queries, metrics, processes, databases, and the OS and hardware to deliver an unprecedented level of visibility into production database activity.

  • SQL for Big Data: Price-performance Advantages of Bare Metal. When building your big data infrastructure, price-performance is a critical factor to evaluate. Data-intensive workloads with the capacity to rapidly scale to hundreds of servers can escalate costs beyond your expectations. The inevitable growth of the Internet of Things (IoT) and fast big data will only lead to larger datasets, and a high-performance infrastructure and database platform will be essential to extracting business value while keeping costs under control. Read more.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • Aerospike demonstrates RAM-like performance with Google Compute Engine Local SSDs. After scaling to 1 M Writes/Second with 6x fewer servers than Cassandra on Google Compute Engine, we certified Google’s new Local SSDs using the Aerospike Certification Tool for SSDs (ACT) and found RAM-like performance and 15x storage cost savings. Read more.

  • Diagnose server issues from a single tab. The Scalyr log management tool replaces all your monitoring and analysis services with one, so you can pinpoint and resolve issues without juggling multiple tools and tabs. It's a universal tool for visibility into your production systems. Log aggregation, server metrics, monitoring, alerting, dashboards, and more. Not just “hosted grep” or “hosted graphs,” but enterprise-grade functionality with sane pricing and insane performance. Trusted by in-the-know companies like Codecademy – try it free! (See how Scalyr is different if you're looking for a Splunk alternative.)

  • aiScaler, aiProtect, aiMobile Application Delivery Controller with integrated Dynamic Site Acceleration, Denial of Service Protection and Mobile Content Management. Cloud deployable. Free instant trial, no sign-up required.  http://aiscaler.com/

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network.

If any of these items interest you there's a full description of each sponsor below. Please click to read more...

Categories: Architecture

Scaling Kim Kardashian to 100 Million Page Views

Mon, 02/16/2015 - 18:02

The team at PAPER estimated their article (NSFW) containing pictures of a very naked Kim Kardashian would quickly receive over 100 million page views. The very definition of bursty viral driven traffic.

As a comparison in 2013 it was estimated Google processed over 500 million searches a day. So a nude Kim Kardashian is worth one-fifth of a Google. Strangely, I can believe it.

How did they handle this traffic gold mine? A complete recounting of the unusual behind the scenes story is told by Paul Ford in How PAPER Magazine’s web engineers scaled their back-end for Kim Kardashian (SFW).  (BTW, only one butt pun was made intentionally in this story, all others are serendipity).

What can we learn from the experience? I know what you are thinking. This is just a single static page with a few static pictures. It’s not a complex problem like search or a social network. Shouldn’t any decent CDN be enough to handle that? And you would be correct, but that’s not the whole of the story:

Categories: Architecture

Stuff The Internet Says On Scalability For February 13th, 2015

Fri, 02/13/2015 - 18:02

Hey, it's HighScalability time:


Stunning depiction of every space mission over the past 50 years. (Max Roser)
  • 700 billion: Apple's valuation; 1: Number of lines of code it takes to bring down UK air traffic control; 20: how old that line of code was in years; 2: the problem was of course a never been seen before double failure; 1: atom-thin silicon transistors may mean super-fast computing; a few: how many data points it takes to identify you
  • Quotable Quotes:
    • @EpicureanDeal: The Uber model is everywhere: using the internet to connect infrequent, consumers of ad hoc, spot market services with fragmented suppliers.
    • @awendt: “I’m sorry you learned about transactions and joins in college, but you’ll have to de-normalize for #microservices” – @adrianco #microxchg
    • @samnewman: @adrianco "JSON was 10x faster than XML. Protobufs 10x faster than JSON. Avro same speed as Protobufs, but half the size"
    • @RichardWarburto: Premature Optimization isn't the root of all evil: misunderstood domain models are.
    • @MichaelPisula: Says @ewolff at #microxchg: start big with your microservices, splitting is easier than joining and your architecture will be wrong anyway
    • @MJFKlewitz: "With vertical scaling the problem is you end up giving a lot of money to Larry Ellison" #greatquote @crichardson #microxchg
    • Jenny Rood: Species of ants which differ in size can coexist peacefully, but the insects will chase away similarly sized competing species.
    • Steven Levy: The nonlinear gains that Moore predicted are so mind-bending that it is no wonder that very few were able to bend their minds around it.
    • ntoshev: There seems to be a fundamental trade-off between latency and throughput, with stream processors optimizing for latency and batch processors optimizing for throughput.
    • Sam Altman: Nobody cares if you’re using an Intel Edison or a 555 to blink the LED in the prototype you show them: people care about whether you’ve made something that they want.
    • @swardley: 30-50 years from genesis to industrialisation is about the average these days
    • Alex Clemmer: 84% of a single-threaded 1KB write in Redis is spent in the kernel
    • @allspaw: Psst: while lots of folks hope for fully "autonomous" tech to solve all the world's ills, I'll just be over here getting some work done.
    • @alejandrocrosa: “The database you read from is just a cached view of the event log”
    • @viktorklang: Optimizing for latency (as in "time to serve") will also yield higher throughput. Thank you, Mr Little.
    • rakoo: using GOMAXPROCS doesn't automagically turn your program into a parallel one
    • fluidcruft: Data science manifesto: The purpose of computing is numbers.

  • Is the golden age of the cheap startup over? The Rising Costs Of Scaling A Startup. In San Francisco it is. Twice as expensive in 2014 than it was in 2009. Wages have doubled. Op-ex has doubled. And thus startup round sizes have increased. People and place costs dwarf compute infrastructure cost savings.

  • Just in case you are of the fashionable opinion Perl code must look like line noise, take a gander: Real measurement is hard. Nice, eh?

  • Magic tricks for algorithms. This may prove helpful in your new job as Algorithm Profiler...Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images:  It is possible to produce images totally unrecognizable to human eyes that DNNs believe with near certainty are familiar objects. 

  • Here's the rare Docker nocker. Many reasons why you should stop using Docker. One side: Idea good, implementation leaves something to be desired. Other side: it tastes great and is less filling. Interesting from dacjames: In general, I agree 100%. With the case of Docker, the concept of containerization is more important than the current project. Decoupling the application environment from the infrastructure environment is an immensely valuable paradigm.

  • Bounty Hunter. A job title that conjures up romantic images and dreams of the never was. You can still be one in the digital age. And make some money too. 11 Essential Bug Bounty Programs of 2015. Hundreds of thousands of dollars are available. Good hunting.

  • When algorithms rule the world you are just one weighting factor away from insignificance. Apple,Apps and Algorithmic Glitches. Divination used to be how we attempted to contol the future. Now we attempt to penetrate to the unknowable heart of opaque algorithms with something different...data.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Rescuing an Outsourced Project from Collapse: 8 Problems Found and 8 Lessons Learned

Wed, 02/11/2015 - 17:56

If you are one of those people that think most of the products featured on HighScalability use way too many servers then you'll love this story: 130 VMs serving less than 10,000 users daily were chopped down to just one machine.

Here's the setup. A smallish website was having problems. Users were unhappy. In the balance was not only the product, but the company. The site was built using Angular, Symfony2, Postgres, Redis, Centos, 8 HP blades with 128 G RAM each, two racks, a very large HP 3par storage array, a 1Gbps uplink, and VMWare.

More than enough power for the task at hand. Yet the system couldn't handle the load. What would you do?

That's the story Jacques Mattheij tells in his very entertaining and educational Saving a Project and a Company article.

Jacques says much was right about the website, but time pressure and mismanagement created big problems at the system level. "A single clueless person in a position of trust with non technical management, an outsourced project and a huge budget, what could possibly go wrong?" Sound familiar? 

Problem 1: Virtualization Gone Crazy
Categories: Architecture

Vinted Architecture: Keeping a busy portal stable by deploying several hundred times per day

Mon, 02/09/2015 - 17:56

This is guest post by Nerijus Bendžiūnas and Tomas Varaneckas of Vinted.

Vinted is a peer-to-peer marketplace to sell, buy and swap clothes. It allows members to communicate directly and has the features of a social networking service.

Started in 2008 as a small community for Lithuanian girls, it developed into a worldwide project that serves over 7 million users in 8 different countries, and is growing non-stop, handling over 200 M requests per day.

Stats
Categories: Architecture

Stuff The Internet Says On Scalability For February 6th, 2015

Fri, 02/06/2015 - 18:03

Hey, it's HighScalability time:


What a beautiful example of Moore's law visualized through the evolution of Lara Croft! (from @silenok)
  • $1 million: per day gross of Clash of Clans
  • Quotable Quotes:
    • @dancow: In 45 minutes, the largest trader in U.S. equities went bankrupt because of bad devops
    • @bmdhacks: How to be a 10x engineer: Incur technical debt fast enough to appear 10x as productive as the ten engineers tasked with cleaning it up.
    • @CompSciFact: Scaling poorly: Performance degrades with problem size Poorly scaled: Things change far more rapidly in one direction than others
    • @mikiobraun: Before scaling out, a machine learning person would always try some approximation shortcut to achieve speed up. #cheating #orisit
    • @cshirky: 3/4 If your organization has ever made a significant and unpleasant change based on something you measured, you can probably use more data.
    • @PatrickMcFadin: Service Discovery Overview: ZooKeeper vs. Consul vs. Etcd vs. Eureka 
    • @jaykreps: TIL: Dequeuing a single item in RabbitMQ requires traversing every single item in the queue. Oh my.
    • @Carnage4Life: No single recipe 4 success. Great companies had bad habits; Apple micromanagement, Google random side projects & Facebook used fricking PHP
    • Stubbornly Persistent: although life would persist in the absence of microbes, both the quantity and quality of life would be reduced drastically.

  • At inflection points change the world must. Netflix: In the early days of Netflix streaming, circa 2008, we manually tracked hundreds of metrics, relying on humans to detect problems.  Our approach worked for tens of servers and thousands of devices, but not for the thousands of servers and millions of devices that were in our future.  Complexity and human-reliant approaches don’t scale; simplicity and algorithm-driven approaches do.

  • IBM is turning Watson into a platform, offering 5 new services: Speech to Text, Text to Speech, Visual Recognition, Concept Insights, Tradeoff Analytics. GA probable next month. Good discussion on Hacker News. Most of the services allow for training through feedback. Some question the quality of the services, but it's early days. Pricing is not set. Hopefully it won't suffer from what these next gen deep learning services tend to suffer from: expensivitis. Who can afford $1.00 per 1000 API calls for a mobile app that needs to acquire users? IBM, make it cheap, try for ubiquity. Cool stuff will happen.

  • Looking for that next step in distributed reliability? Look at TLA+. Murat has several articles on TLA+ and is using it his teaching distributed systems class. Oh, TLA stands for Temporal Logic of Actions. Leslie Lamport has many papers on TLA. James Hamilton wrote up their experiences at Amazon using TLA+: Challenges in Designing at Scale: Formal Methods in Building Robust Distributed Systems: TLA+, a formal specification language invented by ACM Turing award winner, Leslie Lamport. TLA+ is based on simple discrete math, basic set theory and predicates with which all engineers are quite familiar. A TLA+ specification simply describes the set of all possible legal behaviors (execution traces) of a system. 

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Matt Cutts: 10 Lessons Learned from the Early Days of Google

Wed, 02/04/2015 - 18:35


I mainly know of Matt Cutts, long time Google employee (since 2000) and currently head of Google's Webspam team, from his appearances on TwiT with Leo Laporte. On TwiT Matt always comes off as smart, thoughtful, and a really nice guy. This you might expect.

What I didn’t expect is in this talk he gave, Lessons learned from the early days of Google, is that Matt also turns out to be quite funny and a good story teller. The stories he’s telling are about Matt’s early days at Google. He puts a very human face on Google. When you think everything Google does is a calculation made by some behind the scenes AI, Matt reminds us that it’s humans making these decisions and they generally just do the best they can.

The primary theme of the talk is innovation and problem solving through creativity. When you are caught between a rock and a hard place you need to get creative. Question your assumptions. Maybe there’s a creative way to solve your problem?

The talk is short and well worth watching. There are lots of those fun little details that only someone with experience and perspective can give. And there’s lots of wisdom here too. Here’s my gloss on Matt’s talk:

1. Sometimes creativity makes a big difference.
Categories: Architecture

Sponsored Post: Apple, Couchbase, Farmerswife, VividCortex, Internap, SocialRadar, Campanja, Transversal, MemSQL, Scalyr, FoundationDB, AiScaler, Aerospike, AppDynamics, ManageEngine, Site24x7

Tue, 02/03/2015 - 17:56

Who's Hiring?
  • Apple is hiring a Application Security Engineer. Apple’s Gift Card Engineering group is looking for a software engineer passionate about application security for web applications and REST services. Be part of a team working on challenging and fast paced projects supporting Apple's business by delivering high volume, high performance, and high availability distributed transaction processing systems. Please apply here.

  • Want to be the leader and manager of a cutting-edge cloud deployment? Take charge of an innovative 24x7 web service infrastructure on the AWS Cloud? Join farmerswife on the beautiful island of Mallorca and help create the next generation on project management tools. Please apply here.

  • Senior DevOps EngineerSocialRadar. We are a VC funded startup based in Washington, D.C. operated like our West Coast brethren. We specialize in location-based technology. Since we are rapidly consuming large amounts of location data and monitoring all social networks for location events, we have systems that consume vast amounts of data that need to scale. As our Senior DevOps Engineer you’ll take ownership over that infrastructure and, with your expertise, help us grow and scale both our systems and our team as our adoption continues its rapid growth. Full description and application here.

  • Linux Web Server Systems EngineerTransversal. We are seeking an experienced and motivated Linux System Engineer to join our Engineering team. This new role is to design, test, install, and provide ongoing daily support of our information technology systems infrastructure. As an experienced Engineer you will have comprehensive capabilities for understanding hardware/software configurations that comprise system, security, and library management, backup/recovery, operating computer systems in different operating environments, sizing, performance tuning, hardware/software troubleshooting and resource allocation. Apply here.

  • Campanja is an Internet advertising optimization company born in the cloud and today we are one of the nordics bigger AWS consumers, the time has come for us to the embrace the next generation of cloud infrastructure. We believe in immutable infrastructure, container technology and micro services, we hope to use PaaS when we can get away with it but consume at the IaaS layer when we have to. Please apply here.

  • UI EngineerAppDynamics, founded in 2008 and lead by proven innovators, is looking for a passionate UI Engineer to design, architect, and develop our their user interface using the latest web and mobile technologies. Make the impossible possible and the hard easy. Apply here.

  • Software Engineer - Infrastructure & Big DataAppDynamics, leader in next generation solutions for managing modern, distributed, and extremely complex applications residing in both the cloud and the data center, is looking for a Software Engineers (All-Levels) to design and develop scalable software written in Java and MySQL for backend component of software that manages application architectures. Apply here.
Fun and Informative Events
  • Sign Up for New Aerospike Training Courses.  Aerospike now offers two certified training courses; Aerospike for Developers and Aerospike for Administrators & Operators, to help you get the most out of your deployment.  Find a training course near you. http://www.aerospike.com/aerospike-training/
Cool Products and Services
  • See how LinkedIn uses Couchbase to help power its “Follow” service for 300M+ global users, 24x7.  http://info.couchbase.com/14Q4-MKTG-Website-LinkedIn-Scale-LP.html

  • VividCortex is a hosted (SaaS) database performance management platform that provides unparalleled insight and query-level analysis for both MySQL and PostgreSQL servers at micro-second detail. It's not just another tool to draw time-series charts from status counters. It's deep analysis of every metric, every process, and every query on your systems, stitched together with statistics and data visualization. Start a free trial today with our famous 15-second installation.

  • SQL for Big Data: Price-performance Advantages of Bare Metal. When building your big data infrastructure, price-performance is a critical factor to evaluate. Data-intensive workloads with the capacity to rapidly scale to hundreds of servers can escalate costs beyond your expectations. The inevitable growth of the Internet of Things (IoT) and fast big data will only lead to larger datasets, and a high-performance infrastructure and database platform will be essential to extracting business value while keeping costs under control. Read more.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • Aerospike demonstrates RAM-like performance with Google Compute Engine Local SSDs. After scaling to 1 M Writes/Second with 6x fewer servers than Cassandra on Google Compute Engine, we certified Google’s new Local SSDs using the Aerospike Certification Tool for SSDs (ACT) and found RAM-like performance and 15x storage cost savings. Read more.

  • FoundationDB 3.0. 3.0 makes the power of a multi-model, ACID transactional database available to a set of new connected device apps that are generating data at previously unheard of speed. It is the fastest, most scalable, transactional database in the cloud - A 32 machine cluster running on Amazon EC2 sustained more than 14M random operations per second.

  • Diagnose server issues from a single tab. The Scalyr log management tool replaces all your monitoring and analysis services with one, so you can pinpoint and resolve issues without juggling multiple tools and tabs. It's a universal tool for visibility into your production systems. Log aggregation, server metrics, monitoring, alerting, dashboards, and more. Not just “hosted grep” or “hosted graphs,” but enterprise-grade functionality with sane pricing and insane performance. Trusted by in-the-know companies like Codecademy – try it free! (See how Scalyr is different if you're looking for a Splunk alternative.)

  • aiScaler, aiProtect, aiMobile Application Delivery Controller with integrated Dynamic Site Acceleration, Denial of Service Protection and Mobile Content Management. Cloud deployable. Free instant trial, no sign-up required.  http://aiscaler.com/

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network.

If any of these items interest you there's a full description of each sponsor below. Please click to read more...

Categories: Architecture