Software Development Blogs: Programming, Software Testing, Agile Project Management

Methods & Tools

High Scalability - Building bigger, faster, more reliable websites

Sponsored Post: ButterCMS, Aerospike, Loupe, Clubhouse, Stream, Scalyr, VividCortex, MemSQL, InMemory.Net, Zohocorp

Wed, 03/29/2017 - 16:56

Who's Hiring? 
  • Etleap is looking for Senior Data Engineers to build the next-generation ETL solution. Data analytics teams need solid infrastructure and great ETL tools to be successful. It shouldn't take a CS degree to use big data effectively, and abstracting away the difficult parts is our mission. We use Java extensively, and distributed systems experience is a big plus! See full job description and apply here.

  • Advertise your job here! 
Fun and Informative Events
  • Analyst Webinar: Forrester Study on Hybrid Memory NoSQL Architecture for Mission-Critical, Real-Time Systems of Engagement. Thursday, March 30, 2017 | 11 AM PT / 2 PM ET. In today’s digital economy, enterprises struggle to cost-effectively deploy customer-facing, edge-based applications with predictable performance, high uptime and reliability. A new, hybrid memory architecture (HMA) has emerged to address this challenge, providing real-time transactional analytics for applications that require speed, scale and a low total cost of ownership (TCO). Forrester recently surveyed IT decision makers to learn about the challenges they face in managing Systems of Engagement (SoE) with traditional database architectures and their adoption of an HMA. Join us as our guest speaker, Forrester Principal Analyst Noel Yuhanna, and Aerospike’s VP Marketing, Cuneyt Buyukbezci, discuss the survey results and implications for your business. Learn and register

  • Advertise your event here!
Cool Products and Services
  • Etleap provides a SaaS ETL tool that makes it easy to create and operate a Redshift data warehouse at a small fraction of the typical time and cost. It combines the ability to do deep transformations on large data sets with self-service usability, and no coding is required. Sign up for a 30-day free trial.

  • InMemory.Net provides a .NET-native in-memory database for analysing large amounts of data. It runs natively on .NET and provides native .NET, COM, and ODBC APIs for integration. It also has an easy-to-use language for importing data and supports standard SQL for querying data. http://InMemory.Net

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

  • ButterCMS is an API-based CMS that seamlessly drops into your app or website. Great for blogs, dynamic pages, knowledge bases, and more. Butter works with any language/framework including Ruby, Rails, Node.js, .NET, Python, Django, Flask, React, Angular, Go, PHP, Laravel, Elixir, Phoenix, and Meteor.

  • Working on a software product? Clubhouse is a project management tool that helps software teams plan, build, and deploy their products with ease. Try it free today or learn why thousands of teams use Clubhouse as a Trello alternative or JIRA alternative.

  • A note for .NET developers: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Log management, exception tracking, and monitoring solutions can help, but many of them treat the .NET platform as an afterthought. You should learn about Loupe...Loupe is a .NET logging and monitoring solution made for the .NET platform from day one. It helps you find and fix problems fast by tracking performance metrics, capturing errors in your .NET software, identifying which errors are causing the greatest impact, and pinpointing root causes. Learn more and try it free today.

  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorial. Stream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, including apps with 30 million users. With your help we'd like to add a few zeros to that number. Check out the job opening on AngelList.

  • Scalyr is a lightning-fast log management and operational data platform. It's a tool (actually, multiple tools) that your entire team will love. Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • VividCortex is a SaaS database monitoring product that provides the best way for organizations to improve their database performance, efficiency, and uptime. Currently supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora database types, it's a secure, cloud-hosted platform that eliminates businesses' most critical visibility gap. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can proactively fix future performance problems before they impact customers.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

If you are interested in a sponsored post for an event, job, or product, please contact us for more information.

Categories: Architecture

Faster Networks + Cheaper Messages => Microservices => Functions => Edge

Mon, 03/27/2017 - 17:25

When Adrian Cockcroft, the guy who helped put the loud in Cloud through his energetic evangelism of Cloud Native and Microservice architectures, talks about what’s next, it pays to listen. And you can listen: here’s a fascinating, forward-looking talk he gave at microXchg 2017, Shrinking Microservices to Functions. It’s typically Cockcroftian: understated, thoughtful, and full of insight drawn from experience.

Adrian makes a compelling case that the same technology drivers, faster networking and cheaper messaging, that drove the move to Microservices are now driving the move to Functions.

The payoffs are the ones you’ve no doubt been hearing about Serverless for some time, but Adrian develops them in an interesting way. He traces how architectures have evolved over time. Take a look at my gloss of his talk for more details.

What’s next after Functions? Adrian talks about pushing Lambda functions to the edge, a topic I’m excited about and have been interested in for some time, though I didn’t quite see it playing out like this.

Datacenters disappear. Functions no longer run in an AWS region; code is placed near the customer at CDN endpoints. Now you have a fully distributed, at-the-edge, low-latency way of running code, milliseconds from the customer. Now you can build architectures that are partly in the datacenter, partly at the edge, and partly at the customer premises. And since this is AWS, it’s all, of course, built around Lambda. AWS Greengrass and Snowball Edge are peeks into what the future might look like.

There’s a hidden tension here. Once you put code at the edge you violate two of Lambda’s key assumptions: that functions are composed using scalable backend services, and that messaging is low latency. The edge will have a high-latency path back to services in the datacenter, so how do you build a function-based distributed application at the edge? Does edge computing argue for a more retro architecture with fewer messages back to a more monolithic core?

Or does edge computing require something completely different? Here’s one thought as to what that something completely different might look like: Datanet: A New CRDT Database That Let's You Do Bad Bad Things To Distributed Data.
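To make the tension above a little more concrete, here is a rough back-of-the-envelope sketch in Python. It is not from Adrian's talk. The round-trip times, the compute time, and the helper name `request_latency_ms` are all assumptions picked purely for illustration. The model is deliberately simple: a function that makes its backend calls one after another pays one round trip per call.

```python
# Hypothetical numbers, chosen only to illustrate the shape of the problem.
IN_REGION_RTT_MS = 1.0        # assumed round trip between a function and a backend service in the same region
EDGE_TO_REGION_RTT_MS = 80.0  # assumed round trip from an edge location back to the datacenter


def request_latency_ms(sequential_calls: int, round_trip_ms: float, compute_ms: float = 1.0) -> float:
    """Latency of one request that makes `sequential_calls` backend calls, one after another."""
    return compute_ms + sequential_calls * round_trip_ms


# A "chatty" function composed from 10 backend calls, running inside the region.
in_region_chatty = request_latency_ms(10, IN_REGION_RTT_MS)

# The same chatty function moved to the edge: every call now crosses the slow path.
edge_chatty = request_latency_ms(10, EDGE_TO_REGION_RTT_MS)

# A "more monolithic core": the edge function sends a single batched message back.
edge_batched = request_latency_ms(1, EDGE_TO_REGION_RTT_MS)

print(f"in region, 10 calls: {in_region_chatty} ms")
print(f"at edge,   10 calls: {edge_chatty} ms")
print(f"at edge,    1 call:  {edge_batched} ms")
```

Under these assumed numbers the chatty design that was fine inside the region becomes the slowest option at the edge, which is why fewer, coarser messages back to the core start to look attractive.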

Now, let’s see the future by first taking a tour of the past….

From Monoliths, to Microservices, to Functions
Categories: Architecture

Stuff The Internet Says On Scalability For March 24th, 2017

Fri, 03/24/2017 - 16:56

Hey, it's HighScalability time:

 This is real and oh so eerie. Custom microscope takes a 33 hour time lapse of a tadpole egg dividing.
If you like this sort of Stuff then please support me on Patreon.
  • 40Gbit/s: indoor optical wireless networks; 15%: energy produced by wind in Europe; 5: new tasty particles; 2000: Qubits are easy; 30 minutes: flight time for electric helicopter; 42.9%: of heathen StackOverflowers prefer tabs;

  • Quotable Quotes:
    • @RichRogersIoT: "Did you know? The collective noun for a group of programmers is a merge-conflict." - @omervk
    • @tjholowaychuk: reviewed my dad's company AWS expenses, devs love over-provisioning, by like 90% too, guess that's where "serverless" cost savings come in
    • @karpathy: Nature is evolving ~7 billion ~10 PetaFLOP NI agents in parallel, and has been for ~10M+s of years, in a very realistic simulator. Not fair.
    • @rbranson: This is funny, but legit. Production software tends to be ugly because production is ugly. The ugliness outpaces our ability to abstract it.
    • @joeweinman: @harrietgreen1 : Watson IoT center opened in Munich... $200 million dollar investment; 1000 engineers #ibminterconnect
    • David Gerard: This [IBM Blockchain Service] is bollocks all the way down.
    • digi_owl: Sometimes it seems that the diff between a CPU and a cluster is the suffix put on the latency times.
    • Scott Aaronson: I’m at an It from Qubit meeting at Stanford, where everyone is talking about how to map quantum theories of gravity to quantum circuits acting on finite sets of qubits, and the questions in quantum circuit complexity that are thereby raised.
    • Founder Collective: Firebase didn’t try to do everything at once. Instead, they focused on a few core problems and executed brilliantly. “We built a nice syntax with sugar on top,” says Tamplin. “We made real-time possible and delightful.” It is a reminder that entrepreneurs can rapidly add value to the ecosystem if they really focus.
    • Elizabeth Kolbert: Reason developed not to enable us to solve abstract, logical problems or even to help us draw conclusions from unfamiliar data; rather, it developed to resolve the problems posed by living in collaborative groups. 
    • Western Union: the ‘telephone’ has too many shortcomings to be seriously considered as a means of communication.
    • Arthur Doskow: being fair, being humane may cost money. And this is the real issue with many algorithms. In economists’ terms, the inhumanity associated with an algorithm could be referred to as an externality. 
    • Francis: The point is that even if GPUs will support lower precision data types exclusively for AI, ML and DNN, they will still carry the big overhead of the graphics pipeline, hence lower efficiency than an FPGA (in terms of FLOPS/WATT). The winner? Dedicated AI processors, e.g. Google TPU
    • James Glasnapp: When we move out of the physical space to a technological one, how is the concept of a “line” assessed by the customer who can’t actually see the line? 
    • Frank: On the other hand, if institutionalized slavery still existed, factories would be looking at around $7,500 in annual costs for housing, food and healthcare per “worker”.
    • Baron Schwartz: If anyone thought that NoSQL was just a flare-up and it’s died down now, they were wrong...In my opinion, three important areas where markets aren’t being satisfied by relational technologies are relational and SQL backwardness, time series, and streaming data. 
    • CJefferson: The problem is, people tell me that if I just learn Haskell, Idris, Closure, Coffescript, Rust, C++17, C#, F#, Swift, D, Lua, Scala, Ruby, Python, Lisp, Scheme, Julia, Emacs Lisp, Vimscript, Smalltalk, Tcl, Verilog, Perl, Go... then I'll finally find 'programming nirvana'.
    • @spectatorindex: Scientists had to delete Urban Dictionary's data from the memory of IBM's Watson, because it was learning to swear in its answers.
    • Animats: [Homomorphically Encrypted Deep Learning] is a way for someone to run a trained network on their own machine without being able to extract the parameters of the network. That's DRM.
    • Dino Dai Zovi: Attackers will take the least cost path through an attack graph from their start node to their goal node.
    • @hshaban: JUST IN: Senate votes to repeal web privacy rules, allowing broadband providers to sell customer data w/o consent including browsing history
    • KBZX5000: The biggest problem you face, as a student, when taking a programming course at a University level, is that the commercially applicable part of it is very limited in scope. You tend to become decent at writhing algorithms. A somewhat dubious skill, unless you are extremely gifted in mathematics and / or somehow have access to current or unique hardware IP's (IP as in Intellectual Property).
    • Brian Bailey: The increase in complexity of the power delivery network (PDN) is starting to outpace increases in functional complexity, adding to the already escalating costs of modern chips. With no signs of slowdown, designers have to ensure that overdesign and margining do not eat up all of the profit margin.
    • rbanffy: Those old enough will remember the AS/400 (now called iSeries) computers map all storage to a single address space. You had no disk - you had just an address space that encompassed everything and an OS that dealt with that.
    • @disruptivedean: Biggest source of latency in mobile networks isn't milliseconds in core, it's months or years to get new cell sites / coverage installed
    • Greg Ferro: Why Is 40G Ethernet Obsolete? Short Answer: COST. The primary issue is that 40G Ethernet uses 4x10G signalling lanes. On UTP, 40G uses 4 pairs at 10G each. 
    • @adriaanm: "We chose Scala as the language because we wanted the latest features of Spark, as well as [...] types, closures, immutability [...]"
    • ajamesm: There's a difference between (A) locking (waiting, really) on access to a critical section (where you spinlock, yield your thread, etc.) and (B) locking the processor to safely execute a synchronization primitive (mutexes/semaphores).
    • @evan2645: "Chaos doesn't cause problems, it reveals them" - @nora_js #SREcon17Americas #SRECon17
    • chrissnell: We've been running large ES clusters here at Revinate for about four years now. I've found the sweet spot to be about 14-16 data nodes, plus three master-only nodes. Right now, we're running them under OpenStack on top of our own bare metal with SAS disks. It works well but I have been working on a plan to migrate them to live under Kubernetes like the rest of our infrastructure. I think the answer is to put them in StatefulSets with local hostPath volumes on SSD.
    • @beaucronin: Major recurring theme of deep learning twitter is how even those 100% dedicated to the field can't keep up with progress.
    • Chris McNab: VPN certificates and keys are often found within and lifted from email, ticketing, and chat services.
    • @bodil: And it took two hours where the Rust version has taken three days and I'm still not sure it works.
    • azirbel: One thing that's generalizable (though maybe obvious) is to explicitly define the SLAs for each microservice. There were a few weeks where we gave ourselves paging errors every time a smaller service had a deploy or went down due to unimportant errors.
    • bigzen: I'm worn out on articles dissing the performance of SQL databases without quoting any hard numbers and then proceeding to replace the systems with no thanks of development in the latest and great tech. I have nothing against spark, but I find it very hard to believe that alarm code is now readable than SQL. In fact, my experience is just the opposite.
    • jhgg: We are experimenting with webworkers to power a very complicated autocomplete and scoring system in our client. So far so good. We're able to keep the UI running at 60fps while we match, score and sort results in a web-worker.
    • DoubleGlazing: NoSQL doesn't reduce development effort. What you gain from not having to worry about modifying schemas and enforcing referential integrity, you lose from having to add more code to your app to check that a DB document has a certain value. In essence you are moving responsibility for data integrity away from the DB and in to your app, something I think is quite dangerous.
    • Const-me: Too bad many computer scientists who write books about those algorithms prefer to view RAM in an old-fashioned way, as fast and byte-addressable.
    • Azur: It always annoys me a bit when tardigrades are described as extremely hardy: they are not. It is ONLY in the desiccated, cryptobiotic, form they are resistant to adverse conditions.
    • rebootthesystem: Hardware engineers can design FPGA-based hardware optimized for ML. A second set of engineers then uses these boards/FPGA's just as they would GPU's. They write code in whatever language to use them as ML co-processors. This second group doesn't have to be composed of hardware engineers. Today someone using a GPU doesn't have to be a hardware engineer who knows how to design a GPU. Same thing.

  • There should be some sort of Metcalfe's law for events. Maybe: the value of a platform is proportional to the square of the number of scriptable events emitted by unconnected services in the system. CloudWatch Events Now Supports AWS Step Functions as a Target. @ben11kehoe: This is *really* useful: Automate your incident response processes with bulletproof state machines #aws

  • Cute faux O'Reilly book cover. Solving Imaginary Scaling Issues.

  • Intel's Optane SSD is finally out. Though it doesn't quite meet its initial "this will change everything" promise, it still might change a lot of things. Intel’s first Optane SSD: 375GB that you can also use as RAM. 10x DRAM latency. 1/1000 NAND latency. 2400MB/s read, 2000MB/s write. 30 full-drive writes per day. 2.5x better density. $4/GB (1/2 RAM cost). 1.5TB capacity. 500k mixed random IOPS. Great random write response. Targeted at power users with big files, like databases. NDAs are still in place so there's more to learn later.
    • PCPerspective: comparing a server with 768GB of DRAM to one with 128GB of DRAM combined with a pair of P4800X's, 80% of the transactions per second were possible (with 1/6th of the DRAM). More impressive was that matrix multiplication of the data saw a 1.1x *increase* in performance. This seems impossible, as Optane is still slower than DRAM, but the key here was that in the case of the DRAM-only configuration, half of the database was hanging off of the 'wrong' CPU.
    • foboz1: For anyone think that this a solution looking for a problem, think about two things: Big Data and mobile/embedded. Big Data has an endless appetite for large quantities for memory and fast storage; 3D XPoint plays into the memory hierarchy nicely. At the extreme other end of the scale, it may be fast enough to obviate the need for having DRAM+NAND in some applications.
    • raxx7: And 3D XPoint isn't free of limitations yet. RAM has 50-100 ns latency, 50 GB/s bandwidth (128 bit interface) and unlimited write endurance. If 3D XPoint NVDIMM can't deliver this, we'll still need to manage the difference between RAM and 3D XPoint NVDIMM.
    • zogus: The real breakthrough will come, I think, when the OS and applications are re-written so that they no longer assume that a computer's memory consists of a small, fast RAM bank and a huge, slow persistent set of storage--a model that had held true since just about forever.
    • VertexMaster: Given that DRAM is currently an order of magnitude faster (and several orders vs this real-world x-point product) I really have a hard time seeing where this fits in.
    • sologoub: we built a system using Druid as the primary store of reporting data. The setup worked amazingly well with the size/cardinality of the data we had, but was constantly bottlenecked at paging segments in and out of RAM. Economically, we just couldn't justify a system with RAM big enough to hold the primary dataset...I don't have access to the original planning calculations anymore, but 375GB at $1520 would definitely have been a game changer in terms of performance/$, and I suspect be good enough to make the end user feel like the entire dataset was in memory.
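The Optane figures quoted above lend themselves to a quick sanity check. The Python sketch below uses only numbers that appear in the item (375GB, $1520, 30 full-drive writes per day, 768GB versus 128GB of DRAM, 80% of the transactions per second). The variable names are mine, and the use of decimal units (1TB = 1000GB) is an assumption.

```python
# Figures taken from the item above; names and decimal units are illustrative assumptions.
CAPACITY_GB = 375            # capacity of the first Optane SSD
PRICE_USD = 1520             # price quoted by sologoub
DRIVE_WRITES_PER_DAY = 30    # rated full-drive writes per day

DRAM_ONLY_GB = 768           # PCPerspective's DRAM-only server
DRAM_WITH_OPTANE_GB = 128    # DRAM in the server paired with two P4800X drives
THROUGHPUT_RETAINED = 0.80   # share of transactions per second still achieved

# Cost per gigabyte: should land close to the quoted "$4/GB".
price_per_gb = PRICE_USD / CAPACITY_GB

# Rated write budget per day, assuming 1TB = 1000GB.
daily_write_budget_tb = CAPACITY_GB * DRIVE_WRITES_PER_DAY / 1000

# How much less DRAM the Optane-assisted server carries ("1/6th of the DRAM").
dram_reduction_factor = DRAM_ONLY_GB / DRAM_WITH_OPTANE_GB

print(f"price per GB:         ${price_per_gb:.2f}")
print(f"daily write budget:   {daily_write_budget_tb} TB")
print(f"DRAM reduction:       {dram_reduction_factor:.0f}x less DRAM for "
      f"{THROUGHPUT_RETAINED:.0%} of the throughput")
```

So the $1520 price works out to just over $4/GB, consistent with the headline figure, and the PCPerspective comparison amounts to giving up a fifth of the throughput in exchange for a sixfold cut in DRAM.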

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Stuff The Internet Says On Scalability For March 17th, 2017

Fri, 03/17/2017 - 17:08

Hey, it's HighScalability time:

 

Can it be a coincidence that trapping autonomous cars is exactly how demons are trapped on Supernatural?
  • billion billion: exascale operations per second; 250ms: connection time saved by zero round trip time resumption; 800 Million: tons of prey eaten by spiders; 90%: accuracy of quantum computer recognizing trees; 80 GB/s: S3 across 2800 simultaneous functions;

  • Quotable Quotes:
    • @GossiTheDog: Here's something to add to your security threat model: backups. Why steal live data and when you can drive away with exact replica?
    • @ThePublicSquare: "California produces 160% of its 1990 manufacturing, but with just 60% of the workers." -@uclaanderson economist Jerry Nickelsburg
    • @rbranson: makes total sense. I have a friend (who is VC-backed) that has stuff in Azure, GCloud, and AWS to maximize the free credits.
    • @AndrewYNg: If not for US govt funding (DARPA, NSF), US wouldn't be an AI leader today. Proposed cuts to science is big step in wrong direction.
    • @CodeWisdom: "To understand a program you must become both the machine and the program." - Alan Perlis 
    • @codemanship: What does it take to achieve Continuous Delivery? 1. Continuous testing. e.g., Google have 4.2M automated tests, run avg of 35x a day
    • @sebastianstadil: Azure Storage services are down. They really are doing everything like AWS.
Categories: Architecture

Architecture of Probot - My Slack and Messenger Bot for Answering Questions

Wed, 03/15/2017 - 17:25

I programmed a thing. It’s called Probot. Probot is a quick and easy way to get high quality answers to your accounting and tax questions. Probot will find a real live expert to answer your question and handle all the details. You can get your questions answered over Facebook Messenger, Slack, or the web. Answers start at $10. That’s the pitch.

Seems like a natural in this new age of bots, doesn’t it? I thought so anyway. Not so much (so far), but more on that later.

I think Probot is interesting enough to cover because it’s a good example of how one programmer (me) can accomplish quite a lot using today’s infrastructure.

All this newfangled cloud/serverless/services stuff does in fact work. I was able to program a system spanning Messenger, Slack, and the web, in a way that is relatively scalable, available, and affordable, while requiring minimal devops.

Gone are the days of worrying about VPS limits, driving down to a colo site to check on a sick server, or even worrying about auto-scaling clusters of containers/VMs. At least for many use cases.

Many years of programming experience and writing this blog are no protection against making mistakes. I made a lot of stupid, stupid mistakes along the way, but I’m happy with what I came up with in the end.

Here’s how Probot works....

Platform
Categories: Architecture

Sponsored Post: Aerospike, Loupe, Clubhouse, GoCardless, Auth0, InnoGames, Contentful, Stream, Scalyr, VividCortex, MemSQL, InMemory.Net

Tue, 03/14/2017 - 16:56

Who's Hiring?
  • GoCardless is building the payments network for the internet. We’re looking for DevOps Engineers to help scale our infrastructure so that the thousands of businesses using our service across Europe can take payments. You will be part of a small team that sets the direction of the GoCardless core stack. You will think through all the moving pieces and issues that can arise, and collaborate with every other team to drive engineering efforts in the company. Please apply here.

  • InnoGames is looking for Site Reliability Engineers. Do you want to not only play games, but also help build them? Join InnoGames in Hamburg, one of the worldwide leading developers and publishers of online games. You are the kind of person who leaves systems in a better state than they were before. You want to hack on our internal tools based on django/python, as well as improving the stability of our 5000+ Debian VMs. Orchestration with Puppet is your passion and you would rather automate stuff than touch it twice. Relational Database Management Systems aren't a black hole for you? Then apply here!

  • Contentful is looking for a JavaScript BackEnd Engineer to join our team in their mission of getting new users - professional developers - started on our platform within the shortest time possible. We are a fun and diverse family of over 100 people from 35 nations with offices in Berlin and San Francisco, backed by top VCs (Benchmark, Trinity, Balderton, Point Nine), growing at an amazing pace. We are working on a content management developer platform that enables web and mobile developers to manage, integrate, and deliver digital content to any kind of device or service that can connect to an API. See job description.
Fun and Informative Events
  • Analyst Webinar: Forrester Study on Hybrid Memory NoSQL Architecture for Mission-Critical, Real-Time Systems of Engagement. Thursday, March 30, 2017 | 11 AM PT / 2 PM ET. In today’s digital economy, enterprises struggle to cost-effectively deploy customer-facing, edge-based applications with predictable performance, high uptime and reliability. A new, hybrid memory architecture (HMA) has emerged to address this challenge, providing real-time transactional analytics for applications that require speed, scale and a low total cost of ownership (TCO). Forrester recently surveyed IT decision makers to learn about the challenges they face in managing Systems of Engagement (SoE) with traditional database architectures and their adoption of an HMA. Join us as our guest speaker, Forrester Principal Analyst Noel Yuhanna, and Aerospike’s VP Marketing, Cuneyt Buyukbezci, discuss the survey results and implications for your business. Learn and register

  • Your event here!
Cool Products and Services
  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

  • ButterCMS is an API-first CMS that quickly integrates into your app. Rapidly build CMS-powered experiences in any programming language. Great for blogs, marketing pages, knowledge bases, and more. Butter plays well with Ruby, Rails, Node.js, Go, PHP, Laravel, Python, Flask, Django, and more.

  • Working on a software product? Clubhouse is a project management tool that helps software teams plan, build, and deploy their products with ease. Try it free today or learn why thousands of teams use Clubhouse as a Trello alternative or JIRA alternative.

  • A note for .NET developers: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Log management, exception tracking, and monitoring solutions can help, but many of them treat the .NET platform as an afterthought. You should learn about Loupe...Loupe is a .NET logging and monitoring solution made for the .NET platform from day one. It helps you find and fix problems fast by tracking performance metrics, capturing errors in your .NET software, identifying which errors are causing the greatest impact, and pinpointing root causes. Learn more and try it free today.

  • Auth0 is the easiest way to add secure authentication to any app/website. With 40+ SDKs for most languages and frameworks (PHP, Java, .NET, Angular, Node, etc), you can integrate social, 2FA, SSO, and passwordless login in minutes. Sign up for a free 22 day trial. No credit card required. Get Started Now.

  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorial. Stream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, including apps with 30 million users. With your help we'd like to add a few zeros to that number. Check out the job opening on AngelList.

  • Scalyr is a lightning-fast log management and operational data platform. It's a tool (actually, multiple tools) that your entire team will love. Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • InMemory.Net provides a .NET-native in-memory database for analysing large amounts of data. It runs natively on .NET and provides native .NET, COM, and ODBC APIs for integration. It also has an easy-to-use language for importing data and supports standard SQL for querying data. http://InMemory.Net

  • VividCortex is a SaaS database monitoring product that provides the best way for organizations to improve their database performance, efficiency, and uptime. Currently supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora database types, it's a secure, cloud-hosted platform that eliminates businesses' most critical visibility gap. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can proactively fix future performance problems before they impact customers.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

If you are interested in a sponsored post for an event, job, or product, please contact us for more information.

Categories: Architecture

Stuff The Internet Says On Scalability For March 10th, 2017

Fri, 03/10/2017 - 17:56

Hey, it's HighScalability time:

 

Darknet is 4x more resilient than the Internet. An apt metaphor? (URV)
  • > 5 9s: Spanner availability; 200MB: random access from DNA storage; 215 Pbytes/gram: DNA storage; 287,024: Google commits to open source; 42: hours of audio gold; 33: minutes to get back into programming after interruption; 12K: Chinese startups started per day; 35 million: tons of goods shipped under Golden Gate Bridge; 209: mph all-electric Corvette; 500: Disney projects in the cloud; 40%: rise in CO2;

  • Quotable Quotes:
    • Marc Rogers: Anything man can make man can break
    • @manupaisable: 10% of machines @spotify rebooted every hour because of defunct #docker - war stories by @i_maravic @qconlondon
    • @robertcottrell: “the energy cost of each bitcoin transaction is enough to power 3.17 US households for a day”
    • Eric Schmidt: We put $30 billion into this platform. I know this because I approved it. Why replicate that?
    • dim: It uses p30 technology. Just basic things, gliders and lightweight spaceships. Basically, the design goes top-down: At the very top, there's the clock. It is a 11520 period clock. Note that you need about 10.000 generations to ensure the display is updated appropriately, but the design should still be stable with a clock of smaller period (about 5.000 or so - the clock needs to be multiple of 60).
    • Luke de Oliveira: Most people in AI forget that the hardest part of building a new AI solution or product is not the AI or algorithms — it’s the data collection and labeling. Standard datasets can be used as validation or a good starting point for building a more tailored solution.
    • @violetblue: Did a lot of people not know that the CIA is a spy agency?
    • @viktorklang: Async is not about *performance*—it is about *scalability*. Let your friends know
    • stillsut: The difference is in the old days, you adapted to computer. Now, computer must adapt to you.
    • Eric Brewer: Spanner uses two-phase commit to achieve serializability, but it uses TrueTime for external consistency, consistent reads without locking, and consistent snapshots.
    • Emily Waltz: Nomura’s molecular robot differs in that it is composed entirely of biological and chemical components, moves like a cell, and is controlled by DNA.
    • Chris Anderson: Most of the devices in our life, from our cars to our homes, are “entropic,” which is to say they get worse over time. Every day they become more outmoded. But phones and drones are “negentropic” devices. Because they are connected, they get better, because the value comes from the software, not hardware
    • William Dutton: Most people using the internet are actually more social than those who are not using the internet
    • @swardley:  ... by 2016, you should have dabbled / learn / tested serverless.  "Go IaaS" or "build our biz as a cloud" in 2017 is #facepalm
    • Bradford Cross: The incompetent segment: the incompetent segment isn’t going to get machine learning to work by using APIs. They are going to buy applications that solve much higher level problems. Machine learning will just be part of how they solve the problems.
    • @denormalize: What do we want? Machine readable metadata! When do we want it? ERROR Line 1: Unexpected token `
    • @Ocramius: "And we should get rid of users: users are not pure, since they modify the state of our system" #confoo
    • Morning Paper: The most important overarching lesson from our study is this: a single file-system fault can induce catastrophic outcomes in most modern distributed storage systems. 
    • Linus Torvalds: And if the DRM "maintenance" is about sending me random half-arsed crap in one big pull, I'm just not willing to deal with it. This is like the crazy ARM tree used to be.
    • Shaun McCormick: Technical Debt is a Positive and Necessary Step in software engineering
    • @tdierks: Hello, my name is Tim. I'm a lead at Google with over 30 years coding experience and I need to look up how to get length of a python string.
    • @codinghorror: I colocated a $600 Ali Express mini pc for $15/month and it is 2x faster than "the cloud"
    • @antirez: "Group chat is like being in an all-day meeting with random participants and no agenda".
    • @sriramhere: Wise man once wrote "As flexible as it is, compute in AWS is optimized for the old capex world." @sallamar
    • @wattersjames: AI will come to your company carefully disguised as a lot of ETL and data-pipeline work...
    • ceejayoz: Lambda's billed in 100 millisecond increments. EC2 servers are billed in one hour increments. If you need short tasks that run in bursty workloads, Lambda's (potentially) a no-brainer.
    • @codinghorror: we have not found bare metal colocation to be difficult, with one exception: persistent file storage. That part, strangely, is quite hard.
    • @jbeda: Lesson from 10 years at Google: this is true until it isn't. Sometimes you *can* build a better mouse trap.
    • jfoutz: I agree. It's genius in a Lex Luthor kind of way. If I understood the full scope of the application, I like to think i'd decline to work on that. It's easy to imagine engineers working on small parts of the system, and never really connecting the dots that the whole point is to evade law enforcement.
    • dsr_:  It's harder (but not impossible) to have complete service lossage like this [Slack] in a federated protocol. That's why you didn't hear about the great email collapse of 2006.
    • throw_away_777: I agree that neural nets are state-of-the-art and do quite well on certain types of problems (NLP and vision, which are important problems). But a lot of data is structured (sales, churn, recommendations, etc), and it is so much easier to train an xgboost model than a neural net model. 
    • @GossiTheDog: #Vault7 CIA - Wiki that Wikileaks released is/was on hosted on DEVLAN, the CIA's "dirty" development network - a major architecture error.
    • Alison Gopnik: new studies suggest that both the young and the old may be especially adapted to receive and transmit wisdom. We may have a wider focus and a greater openness to experience when we are young or old than we do in the hurly-burly of feeding, fighting and reproduction that preoccupies our middle years.
    • @pierre: Wow, audacious to say the least. Intentionally flagging authorities to mislead them. It's like the VW emissions code
    • Joan Gamell: Starting with the obvious: the CIA uses JIRA, Confluence and git. Yes, the very same tools you use every day and love/hate. 
    • Chris Baraniuk: The networks of genes in each animal is a bit like the network of neurons in our brains, which suggests they might be "learning" as they go
    • futurePrimitive: Managers seem to think that programming is typing. No. Programming is *thinking*. The stuff that *looks* like work to a manager (energetic typing) only happens after the hard work is done silently in your head.
    • @danielbryantuk: "There is no such thing as a 'stateless' architecture. It's just someone else's problem" @jboner #qconlondon
    • Platypus: There's no panacea for vendor lock-in. Not even open source, but open source alone gets you further than any number of standards that don't cover what really matters or vendor-provided tools that might go away at any moment. It's the first and best tool for dealing with lock-in, even if it's not perfect. 
    • @tpuddle: @cliff_click talk at #qconlondon about fraud detection in financial trades. Searching 1 billion trades a day "is not that big". !
    • @charleshumble: "Something I see in about 95% of the trading data sets is there are a small number of bad guys hammering it." Cliff Click #qconlondon

  • You may not be able to hear doves cry, but you can listen to machines talk. Elevators to be precise. Watch them chat away as they selflessly shuttle to and fro. Yes, it is as exciting as you might imagine. Though probably not very different than the interior dialogue of your average tool.

  • It used to be that winners wrote history. Now victors destroy data. Terabytes of Government Data Copied

  • Battling legacy code seems to be the number one issue on Stack Overflow, as determined by top books mentioned on Stack Overflow. Not surprising. What was surprising is what's not on the list: algorithm books. Books on the craft of programming took top honors. Gratifying, but at odds with current interviewing dogma. The top 10 books: Working Effectively with Legacy Code; Design Patterns; Clean Code; Java Concurrency in Practice; Domain-Driven Design; JavaScript; Patterns of Enterprise Application Architecture; Code Complete; Refactoring; Head First Design Patterns.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: Architecture

Part 4 of Thinking Serverless —  Addressing Security Issues

Mon, 03/06/2017 - 17:56

This is a guest repost by Ken Fromm, a 3x tech co-founder (Vivid Studios, Loomia, and Iron.io). Here are Parts 1, 2, and 3.

This post is the last of a four-part series that dives into developing applications in a serverless way. These insights are derived from several years working with hundreds of developers while they built and operated serverless applications and functions.

The platform was the serverless platform from Iron.io, but these lessons can also apply to AWS Lambda, Google Cloud Functions, Azure Functions, and IBM’s OpenWhisk project.

Arriving at a good definition of cloud IT security is difficult, especially in the context of highly scalable distributed systems like those found in serverless platforms. The purpose of this post is not to provide an exhaustive set of principles but to highlight areas that developers, architects, and security officers might wish to consider when evaluating or setting up serverless platforms.

Serverless Processing — Similar But Different

High-scale task processing is certainly not a new concept in IT; it has parallels that date back to the days of job processing on mainframes. The abstraction layer provided by serverless processing, in combination with large-scale cloud infrastructure and advanced container technologies, does, however, bring about capabilities that are markedly different from those of even just a few years ago.

By plugging into a serverless computing platform, developers do not need to provision resources based on current or anticipated loads or put great effort into planning for new projects. Working and thinking at the task level means that developers are not paying for resources they are not using. Also, regardless of the number of projects in production or in development, developers using serverless processing do not have to worry about managing resources or provisioning systems.
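As a rough illustration of what "not paying for resources they are not using" can mean, here is a minimal sketch that compares billed time when each task is rounded up to a small billing increment with billed time when a provisioned server is rounded up to whole hours. The 100 ms and one-hour increments, the `billed_ms` helper, and the workload are assumptions chosen for illustration; they do not describe Iron.io's billing or any provider's current price list.

```python
# Illustrative sketch only: the billing increments below are assumptions,
# not any provider's actual price list.

def billed_ms(duration_ms: int, increment_ms: int) -> int:
    """Round a duration up to the next whole billing increment (integer math)."""
    if duration_ms < 0 or increment_ms <= 0:
        raise ValueError("duration must be >= 0 and increment must be > 0")
    increments = -(-duration_ms // increment_ms)  # ceiling division
    return increments * increment_ms

PER_TASK_INCREMENT_MS = 100        # assumed fine-grained, per-invocation billing
PER_HOUR_INCREMENT_MS = 3_600_000  # assumed coarse, per-server-hour billing

# Hypothetical bursty workload: 1,000 short tasks of 250 ms each.
task_durations_ms = [250] * 1000

# Each task is rounded up on its own under per-task billing.
per_task_total = sum(billed_ms(d, PER_TASK_INCREMENT_MS) for d in task_durations_ms)

# The whole burst is rounded up to a full hour on a provisioned server.
per_hour_total = billed_ms(sum(task_durations_ms), PER_HOUR_INCREMENT_MS)

print(f"per-task billing: {per_task_total / 1000:.0f} s billed")
print(f"per-hour billing: {per_hour_total / 1000:.0f} s billed")
```

The sketch compares billed time only, not cost: the price per unit of time is not modeled and can differ widely between the two models, so a real comparison would need actual rates.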

While serving as Iron.io’s security officer, I answered a number of security questionnaires from customers. One common theme was that they were all in need of a serious update to bring them forward into this new world. Very few had any accommodation for cloud computing, much less serverless processing.

Most questionnaires still viewed servers as persistent entities needing constant care and feeding. They presumed physical resources as opposed to virtualization, autoscaling, shared resources, and separation of concerns. Their questions lacked differentiation between data centers and development and operation centers. A few still asked for the ability to physically inspect data centers, which is, by and large, not really an option these days. And very few addressed APIs, logging, data persistence, or data retention.

The format of the sections below follows the order found in many of these security questionnaires as well as several cloud security policies. The order has been flipped a bit to start with areas where developers can have an impact. Later sections will address platform and system issues that teams will want to be aware of but that are largely in the domain of serverless platforms and infrastructure providers.

Security Topics

Data Security
Categories: Architecture