Go vs Python: A Performance Showdown for the Modern Developer
Empirical Analysis of Execution Speed, Memory Utilization, and Concurrency in Real-World Scenarios
Introduction
Go, developed at Google, and Python, first released in 1991, are both immensely popular among developers. When performance is a key requirement, however, the choice between them becomes critical: application performance directly affects user experience and operational costs.
This article provides an empirical comparison of Go and Python across three performance metrics: execution speed, memory utilization, and concurrency handling. Using real code examples and measurements, we make a data-driven case for choosing Go for performance-critical applications.
Methodology
The tests were run on the same system with an Intel Core i7 4th Gen processor, 8GB RAM, and SSD storage to provide a standardized benchmark. The tests for both languages were run three times to average out any inconsistencies and anomalies.
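The repeat-and-average procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the harness used for the article's numbers; the names average_runtime and workload are hypothetical, and workload is only a stand-in for one of the benchmark programs.

```python
import statistics
import time


def average_runtime(func, runs=3):
    """Call func `runs` times and return (mean_seconds, samples)."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), samples


def workload():
    # Hypothetical stand-in for one of the benchmark programs.
    sum(i * i for i in range(100_000))


mean_s, samples = average_runtime(workload)
print(f"mean: {mean_s:.4f}s over {len(samples)} runs")
```

Keeping the individual samples alongside the mean makes it easy to spot a single outlier run, which three repetitions alone cannot smooth out.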
Test Scenarios
CPU-Bound Task: Parsing and processing a large JSON file.
Memory Utilization: Measuring memory footprint when handling large data sets.
Concurrency: Running 1000 concurrent tasks that perform two operations: calculating the 20th Fibonacci number (CPU-bound) and simulating a brief I/O delay between 0 and 10 milliseconds (I/O-bound).
In execution speed, memory utilization, and concurrency metrics, lower numbers are better, indicating faster execution and lower resource utilization.
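The article does not say how the JSON input for the CPU-bound scenario was produced. As an assumption, a file with the name and age fields that the parsing code reads could be generated as sketched below; the function names, the file name sample_people.json, and the record count are all hypothetical.

```python
import json
import random


def make_people(count, seed=0):
    """Build `count` records with the name/age fields the parsers read."""
    rng = random.Random(seed)
    return [
        {"name": f"person-{i}", "age": rng.randint(18, 90)}
        for i in range(count)
    ]


def write_people(path, count):
    """Write the generated records to `path` as one JSON array."""
    with open(path, "w") as fh:
        json.dump(make_people(count), fh)


if __name__ == "__main__":
    # 1,000 records keeps the demo quick; the article's file size is not stated.
    write_people("sample_people.json", 1_000)
```

A fixed seed keeps the generated file identical between runs, so both languages parse exactly the same input.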
Results and Discussion
CPU-Bound Task Performance: Parsing and Processing a Large JSON File
Go Code
Here, we'll use Go to parse a large JSON file.
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"os"
	"time"
)

// Person mirrors one record in large_file.json.
type Person struct {
	Name string `json:"name"`
	Age  int    `json:"age"`
}

func main() {
	start := time.Now()

	// os.ReadFile replaces the deprecated ioutil.ReadAll pattern.
	data, err := os.ReadFile("large_file.json")
	if err != nil {
		log.Fatal(err)
	}

	var persons []Person
	if err := json.Unmarshal(data, &persons); err != nil {
		log.Fatal(err)
	}

	for _, person := range persons {
		fmt.Println("Name:", person.Name, "Age:", person.Age)
	}

	elapsed := time.Since(start)
	fmt.Printf("Time taken: %s\n", elapsed)
}
Python Code
In Python, we use the json library to perform the same task.
import json
import time

start = time.time()

with open("large_file.json", "r") as file:
    data = json.load(file)

for person in data:
    print("Name:", person["name"], "Age:", person["age"])

elapsed = time.time() - start
print(f"Time taken: {elapsed}s")
Analysis
Go generally shows better performance due to its compiled nature and optimized CPU usage. Python, being an interpreted language, adds an extra layer of execution that can slow down CPU-bound tasks. In this test the gap was modest, roughly 10%: on the identical hardware setup, Go took 67 seconds and Python took 74 seconds to parse and print the same JSON file.
Go | Python
67 seconds | 74 seconds
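Both programs start the timer before reading the file and stop it after the print loop, so the figures above include the cost of writing every record to standard output. A minimal sketch of how the two phases could be timed separately is shown below; the function name time_parse_and_print is hypothetical, and a small in-memory document stands in for large_file.json.

```python
import io
import json
import time


def time_parse_and_print(raw_json, out):
    """Return (parse_seconds, print_seconds) for one JSON document."""
    start = time.perf_counter()
    people = json.loads(raw_json)
    parsed = time.perf_counter()
    for person in people:
        print("Name:", person["name"], "Age:", person["age"], file=out)
    printed = time.perf_counter()
    return parsed - start, printed - parsed


# Small in-memory stand-in for large_file.json.
raw = json.dumps([{"name": f"person-{i}", "age": 30} for i in range(10_000)])
parse_s, print_s = time_parse_and_print(raw, io.StringIO())
print(f"parse: {parse_s:.4f}s, print: {print_s:.4f}s")
```

Reporting the two phases separately shows how much of a run is JSON decoding and how much is output, which matters when the goal is to compare the parsers themselves.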
Memory Utilization: Handling Large Data Sets
External tools such as top in Linux can be used to measure the memory footprint of the running processes. This gives a better idea of how much memory each program is consuming.
Go Code
In Go, we can observe the memory footprint when building a linked list of 4,000 nodes, each holding an array of 4,000 integers.
package main

import (
	"fmt"
	"runtime"
	"time"
)

// NestedStruct is one list node holding a 4,000-element integer array.
type NestedStruct struct {
	arr  [4000]int
	next *NestedStruct
}

func main() {
	start := time.Now()

	// Build a linked list of 4,000 nodes.
	head := &NestedStruct{}
	curr := head
	for i := 0; i < 3999; i++ {
		curr.next = &NestedStruct{}
		curr = curr.next
	}

	// mem.Alloc reports the bytes of heap objects currently allocated.
	var mem runtime.MemStats
	runtime.ReadMemStats(&mem)
	fmt.Printf("Memory Used: %d KB\n", mem.Alloc/1024)

	elapsed := time.Since(start).Milliseconds()
	fmt.Printf("Time taken: %d ms\n", elapsed)
}
Python Code
import time
import sys


class NestedStruct:
    def __init__(self):
        self.arr = [0] * 4000
        self.next = None


start = time.time()

head = NestedStruct()
curr = head
for i in range(3999):
    curr.next = NestedStruct()
    curr = curr.next

memory_used = sys.getsizeof(head)
curr = head
while curr:
    memory_used += sys.getsizeof(curr.arr) + sys.getsizeof(curr.next)
    curr = curr.next

print(f"Memory Used: {memory_used/1024} KB")
elapsed = (time.time() - start) * 1000
print(f"Time taken: {elapsed} ms")
Results and Analysis
Metric | Go | Python
Memory used | 4414 KB | 125406 KB
Time taken | 85 ms | 160 ms
Go, being statically typed, allows for more efficient memory utilization, especially for large data sets. On the other hand, Python's dynamic typing can sometimes lead to unexpected memory consumption.
Assessing the Fairness of the Comparison
To showcase differences in memory utilization between Go and Python, these examples should be adequate because they use a similar nested data structure in both languages. This creates a fair comparison point.
If you want the differences in memory usage or time to be more pronounced, you could use more complex data structures, manipulate larger arrays, or increase the depth of nesting.
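It's worth noting that Python's sys.getsizeof() only provides a shallow size and doesn't account for auxiliary data structures that Python's dynamic typing might use. The two measurements also capture different things. On a 64-bit platform the arrays alone amount to 4,000 × 4,000 × 8 bytes, or about 125,000 KB, in either language, so a reading of 4,414 KB from Go's mem.Alloc most likely reflects nodes that the garbage collector had already reclaimed: nothing in the Go program references the head of the list once the loop is running. The memory figures should therefore be read with caution.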
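To put the shallow-size caveat in concrete terms, the sketch below computes the raw payload of the structure and measures a scaled-down copy with Python's tracemalloc module, which counts the bytes actually allocated rather than summing shallow sizes. It assumes a 64-bit build where each list slot takes 8 bytes; the helper names build_list and traced_kib are hypothetical, and the constructor takes a length parameter (unlike the article's class) so the demo can run on a smaller list.

```python
import tracemalloc


class NestedStruct:
    def __init__(self, length):
        self.arr = [0] * length
        self.next = None


def build_list(nodes, length):
    """Build a linked list of `nodes` nodes, each with a `length`-slot list."""
    head = NestedStruct(length)
    curr = head
    for _ in range(nodes - 1):
        curr.next = NestedStruct(length)
        curr = curr.next
    return head


def traced_kib(nodes, length):
    """KiB allocated while the whole list is still reachable."""
    tracemalloc.start()
    head = build_list(nodes, length)  # keep a reference so nothing is freed
    current, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del head
    return current / 1024


# Raw payload of the article's structure, assuming 8-byte slots (64-bit).
payload_kib = 4000 * 4000 * 8 / 1024
print(f"expected payload: {payload_kib:.0f} KiB")
print(f"traced (400 x 400 demo): {traced_kib(400, 400):.0f} KiB")
```

The measurement is taken while head is still in scope, so every node is reachable and counted. Any comparison across languages needs the same guarantee on both sides.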
Comparative Analysis of Concurrency in Go and Python
This comparison aims to shed light on how Go's goroutines and Python's asyncio handle tasks that are both CPU-bound and I/O-bound. Both languages are tasked with calculating the 20th Fibonacci number (simulating a CPU-bound task) and sleeping for a random short duration (simulating an I/O-bound task), and this process is repeated 1000 times. It's also worth mentioning that Go's garbage collector has been highly optimized for low-latency and high-concurrency environments, which could further boost its performance in certain scenarios.
The comparison is complicated by the fact that Python's Global Interpreter Lock (GIL) can limit concurrency in CPU-bound tasks, while Go's goroutines can take full advantage of multiple CPU cores.
Go Code
In the Go example, we use a sync.WaitGroup to manage multiple goroutines. Each goroutine performs two tasks:
Calculate the 20th Fibonacci number: fib(20)
Sleep for a random duration between 0 and 10 milliseconds: fakeIO()
Here's the core part of the Go code:
package main

import (
	"fmt"
	"math/rand"
	"sync"
	"time"
)

func fib(n int) int {
	if n <= 1 {
		return n
	}
	return fib(n-1) + fib(n-2)
}

func fakeIO() {
	time.Sleep(time.Millisecond * time.Duration(rand.Intn(10)))
}

func task(wg *sync.WaitGroup) {
	defer wg.Done()
	fib(20)  // Simulate CPU-bound task
	fakeIO() // Simulate I/O-bound task
}

func main() {
	start := time.Now()
	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go task(&wg)
	}
	wg.Wait()
	elapsed := time.Since(start)
	fmt.Printf("Time taken: %s\n", elapsed)
}
Go time taken: 39.58 ms
Python Code
The Python version uses the asyncio library to manage coroutines. Each coroutine performs two tasks:
Calculate the 20th Fibonacci number: await fib(20)
Sleep for a random duration between 0 and 10 milliseconds: await fake_io()
import asyncio
import random
import time


async def fib(n):
    if n <= 1:
        return n
    return await fib(n-1) + await fib(n-2)


async def fake_io():
    await asyncio.sleep(random.uniform(0, 0.01))


async def task():
    await fib(20)  # Simulate CPU-bound task
    await fake_io()  # Simulate I/O-bound task


async def main():
    tasks = []
    for _ in range(1000):
        tasks.append(task())
    await asyncio.gather(*tasks)


start = time.perf_counter()
asyncio.run(main())
elapsed = time.perf_counter() - start
print(f"Time taken: {elapsed}s")
Python time taken: 4.2 s
Analysis
Concurrency Model: Both Go and Python are given the same workload, but the runtimes execute it differently. Go schedules goroutines across the available CPU cores, whereas asyncio runs every coroutine on a single thread, so the 1000 fib(20) calls execute one after another.
Performance: Go finished in 39.58 ms against Python's 4.2 s, roughly a 100-fold difference. The absence of a Global Interpreter Lock (GIL) lets Go make better use of multi-core CPUs.
Simplicity: Python's asyncio code may look more familiar, but the Go code handles both CPU-bound and I/O-bound work with a single construct, the goroutine.
Go's compiled nature allows for machine-level optimizations that Python's bytecode interpreter does not perform, potentially further enhancing its performance.
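One detail of the Python benchmark deserves a closer look: fib is declared as a coroutine, so every recursive call creates and awaits a coroutine object even though nothing in it waits on I/O. The variant below is a sketch of a common alternative, not the code that produced the 4.2 s figure: it keeps fib as a plain function and awaits only the simulated I/O. The count parameter on main is added here for convenience.

```python
import asyncio
import random
import time


def fib(n):
    """Plain recursive Fibonacci; no coroutine objects are created."""
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)


async def task():
    result = fib(20)  # CPU-bound: blocks the event loop while it runs
    await asyncio.sleep(random.uniform(0, 0.01))  # simulated I/O
    return result


async def main(count=1000):
    # gather schedules all tasks on the single event-loop thread.
    return await asyncio.gather(*(task() for _ in range(count)))


start = time.perf_counter()
results = asyncio.run(main())
print(f"Time taken: {time.perf_counter() - start:.2f}s")
```

This removes the per-call coroutine overhead but not the underlying constraint: the Fibonacci calls still run serially on one thread, so it should narrow the gap with Go rather than close it.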
Limitations
It's important to note that the scope of this article is restricted to specific performance metrics and may not cover all aspects that could influence the choice between Go and Python. Furthermore, the hardware used for testing is a single configuration and may not represent performance across all types of hardware setups.
Case Studies
Dropbox: From Python to Go
Dropbox initially utilized Python for its backend services. However, as their user base grew, they began encountering performance issues that Python couldn't efficiently handle. They transitioned to Go and reported a three-fold improvement in their request-per-second capabilities.
Twitch: Scalability with Go
Twitch, the popular live-streaming platform, had to manage over a million concurrent connections in their chat services. They adopted Go for its efficiency and were able to handle the high concurrency with minimal resource utilization.
Conclusion and Must-Know Takeaways
In conclusion, when performance optimization is a critical requirement, Go stands out for its CPU efficiency and memory utilization. Python, on the other hand, offers a wide range of libraries and a readability that aids rapid development. Both languages have their merits, so your final choice should align closely with the specific needs of your project.