Parallel: a distributed computing module for Python

GPL
Python
Cross-platform
2013-11-02
欢哥

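Parallel Python is an open-source Python module for distributed computing. It spreads the computational load across the cores of a multi-core CPU or across several machines in a cluster, and makes it easy to set up a self-organizing distributed computing platform on an internal network. Starting with multi-core computing: an ordinary Python application can use only one CPU process, whereas Parallel Python makes it easy to extend the computation to multiple CPU processes.

Example code: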

#!/usr/bin/python
# File: sum_primes.py
# Author: Vitalii Vanovschi
# Desc: This program demonstrates parallel computations with pp module
# It calculates the sum of prime numbers below a given integer in parallel
# Parallel Python Software: http://www.parallelpython.com

import math, sys, time
import pp

def isprime(n):
    """Returns True if n is prime and False otherwise"""
    if not isinstance(n, int):
        raise TypeError("argument passed to isprime is not of 'int' type")
    if n < 2:
        return False
    if n == 2:
        return True
    limit = int(math.ceil(math.sqrt(n)))  # named "limit" to avoid shadowing the built-in max()
    i = 2
    while i <= limit:
        if n % i == 0:
            return False
        i += 1
    return True

def sum_primes(n):
    """Calculates sum of all primes below given integer n"""
    return sum([x for x in xrange(2,n) if isprime(x)])

print """Usage: python sum_primes.py [ncpus]
    [ncpus] - the number of workers to run in parallel, 
    if omitted it will be set to the number of processors in the system
"""

# tuple of all parallel python servers to connect with
ppservers = ()
#ppservers = ("10.0.0.1",)

if len(sys.argv) > 1:
    ncpus = int(sys.argv[1])
    # Creates jobserver with ncpus workers
    job_server = pp.Server(ncpus, ppservers=ppservers)
else:
    # Creates jobserver with automatically detected number of workers
    job_server = pp.Server(ppservers=ppservers)

print "Starting pp with", job_server.get_ncpus(), "workers"

# Submit a job of calculating sum_primes(100) for execution.
# sum_primes - the function
# (100,) - tuple with arguments for sum_primes
# (isprime,) - tuple with functions on which function sum_primes depends
# ("math",) - tuple with module names which must be imported before sum_primes execution
# Execution starts as soon as one of the workers becomes available
job1 = job_server.submit(sum_primes, (100,), (isprime,), ("math",))

# Retrieves the result calculated by job1
# The value of job1() is the same as sum_primes(100)
# If the job has not been finished yet, execution will wait here until result is available
result = job1()

print "Sum of primes below 100 is", result

start_time = time.time()

# The following submits 8 jobs and then retrieves the results
inputs = (100000, 100100, 100200, 100300, 100400, 100500, 100600, 100700)
jobs = [(input, job_server.submit(sum_primes,(input,), (isprime,), ("math",))) for input in inputs]
for input, job in jobs:
    print "Sum of primes below", input, "is", job()

print "Time elapsed: ", time.time() - start_time, "s"
job_server.print_stats()

# Parallel Python Software: http://www.parallelpython.com
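
The script above is written for Python 2 (print statements, xrange) and needs the pp module installed. As a rough cross-check, the following is a minimal serial sketch of the same two functions in Python 3 syntax. It does not use pp at all and is not part of the original example; it assumes Python 3.8 or later for math.isqrt, and it only gives a single-process reference value to compare against the numbers the workers return.

```python
import math


def isprime(n):
    """Return True if n is prime, False otherwise."""
    if not isinstance(n, int):
        raise TypeError("argument passed to isprime is not of 'int' type")
    if n < 2:
        return False
    # Trial division by every integer from 2 up to floor(sqrt(n)).
    for i in range(2, math.isqrt(n) + 1):
        if n % i == 0:
            return False
    return True


def sum_primes(n):
    """Return the sum of all primes strictly below n."""
    return sum(x for x in range(2, n) if isprime(x))


if __name__ == "__main__":
    # Serial reference value for the first job submitted in the pp script.
    print("Sum of primes below 100 is", sum_primes(100))
```

Here sum_primes(100) evaluates to 1060, which is the value job1() should return in the parallel script. Timing the same serial function over the eight larger inputs gives a baseline to set against the elapsed time that the pp version prints.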