Numba, a NumPy computation accelerator: official tutorial, GPU CUDA configuration
Official site: http://numba.pydata.org/
Official tutorial: http://numba.pydata.org/numba-doc/latest/user/5minguide.html
Because under my Python 3.7 (or possibly due to some other factor) numba.autojit could no longer be found, I went to the official site to see what had happened.
Example
The following code speeds up nicely:
from numba import jit
import numpy as np

x = np.arange(100).reshape(10, 10)

@jit(nopython=True)  # Set "nopython" mode for best performance, equivalent to @njit
def go_fast(a):  # Function is compiled to machine code when called the first time
    trace = 0.0
    for i in range(a.shape[0]):  # Numba likes loops
        trace += np.tanh(a[i, i])  # Numba likes NumPy functions
    return a + trace  # Numba likes NumPy broadcasting

print(go_fast(x))
The following code does not speed up (the function cannot benefit from Numba acceleration):
from numba import jit
import pandas as pd

x = {'a': [1, 2, 3], 'b': [20, 30, 40]}

@jit
def use_pandas(a):  # Function will not benefit from Numba jit
    df = pd.DataFrame.from_dict(a)  # Numba doesn't know about pd.DataFrame
    df += 1  # Numba doesn't understand what this is
    return df.cov()  # or this!

print(use_pandas(x))
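A common fix, sketched below with my own function names rather than code from the tutorial, is to keep the pandas work in plain Python and hand the jitted function only NumPy arrays, which Numba does understand:

from numba import njit
import numpy as np
import pandas as pd

@njit
def shifted_cov(a, b):  # works on plain NumPy arrays, so it compiles in nopython mode
    a = a + 1.0  # the "df += 1" step, done on arrays
    b = b + 1.0
    ma, mb = a.mean(), b.mean()
    return ((a - ma) * (b - mb)).sum() / (a.size - 1)  # sample covariance of the two columns

df = pd.DataFrame({'a': [1, 2, 3], 'b': [20, 30, 40]})
print(shifted_cov(df['a'].to_numpy(np.float64), df['b'].to_numpy(np.float64)))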
Note that Numba accelerates functions via decorators: the first time a decorated function is called it is compiled to machine code, which takes some time; every later call runs that machine code directly, and that is where the speedup comes from.
So the most commonly used form is @njit, or equivalently @jit(nopython=True).
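To see this one-time compilation cost, here is a small timing sketch of my own (the exact numbers will depend on your machine):

import time
import numpy as np
from numba import njit

x = np.arange(100).reshape(10, 10)

@njit
def go_fast(a):
    trace = 0.0
    for i in range(a.shape[0]):
        trace += np.tanh(a[i, i])
    return a + trace

start = time.perf_counter()
go_fast(x)  # first call: includes compilation to machine code
print("first call: ", time.perf_counter() - start)

start = time.perf_counter()
go_fast(x)  # second call: runs the cached machine code
print("second call:", time.perf_counter() - start)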
Other features
Numba has quite a few decorators, we’ve seen @jit, but there’s also:
@njit - this is an alias for @jit(nopython=True) as it is so commonly used!
@vectorize - produces NumPy ufuncs (with all the ufunc methods supported); see the sketch after this list. Docs are here.
@guvectorize - produces NumPy generalized ufuncs. Docs are here.
@stencil - declare a function as a kernel for a stencil like operation. Docs are here.
@jitclass - for jit aware classes. Docs are here.
@cfunc - declare a function for use as a native call back (to be called from C/C++ etc). Docs are here.
@overload - register your own implementation of a function for use in nopython mode, e.g. @overload(scipy.special.j0). Docs are here.
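As an illustration of @vectorize, a minimal sketch of my own (the signature and function name are assumptions, not from the tutorial); because the signature is given up front, the result is a real NumPy ufunc, so ufunc methods such as reduce work too:

from numba import vectorize
import numpy as np

@vectorize(['float64(float64, float64)'])  # eagerly compiled into a NumPy ufunc
def scaled_add(x, y):
    return 2.0 * x + y

a = np.arange(5, dtype=np.float64)
print(scaled_add(a, a))      # elementwise, like any ufunc
print(scaled_add.reduce(a))  # ufunc methods such as reduce are available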
Extra options available in some decorators:
parallel = True - enable the automatic parallelization of the function.
fastmath = True - enable fast-math behaviour for the function.
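A small sketch of my own combining the two options above with prange, which marks a loop as explicitly parallelizable:

from numba import njit, prange
import numpy as np

@njit(parallel=True, fastmath=True)  # automatic parallelization plus fast-math
def row_sums(a):
    out = np.empty(a.shape[0])
    for i in prange(a.shape[0]):  # iterations of this loop may run in parallel
        out[i] = a[i, :].sum()
    return out

print(row_sums(np.arange(100.0).reshape(10, 10)))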
ctypes/cffi/cython interoperability:
cffi - The calling of CFFI functions is supported in nopython mode.
ctypes - The calling of ctypes wrapped functions is supported in nopython mode.
Cython exported functions are callable.
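For instance, a ctypes sketch of my own (it assumes a Unix-like system where the C math library can be located):

import ctypes
import ctypes.util
from numba import njit

libm = ctypes.CDLL(ctypes.util.find_library('m'))  # load the C math library
libm.cos.restype = ctypes.c_double
libm.cos.argtypes = [ctypes.c_double]
c_cos = libm.cos

@njit
def call_c_cos(x):
    return c_cos(x)  # ctypes-wrapped function called from nopython code

print(call_c_cos(0.0))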
GPU targets:
Numba can target Nvidia CUDA and (experimentally) AMD ROC GPUs. You can write a kernel in pure Python and have Numba handle the computation and data movement (or do this explicitly). Click for Numba documentation on CUDA or ROC.
http://numba.pydata.org/numba-doc/latest/cuda/index.html#numba-for-cuda-gpus
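As a taste of the CUDA target, a minimal kernel sketch of my own (it needs an NVIDIA GPU with a working CUDA driver; the array size and launch configuration are arbitrary):

from numba import cuda
import numpy as np

@cuda.jit
def add_one(arr):
    i = cuda.grid(1)      # absolute index of this thread
    if i < arr.size:      # guard threads that fall outside the array
        arr[i] += 1.0

data = np.zeros(32, dtype=np.float64)
d_data = cuda.to_device(data)                # explicit host-to-device copy
threads_per_block = 32
blocks = (data.size + threads_per_block - 1) // threads_per_block
add_one[blocks, threads_per_block](d_data)   # kernel launch
print(d_data.copy_to_host())                 # copy the result back to the host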
That is quite a lot, but @autojit is nowhere to be seen. Has it been removed? Indeed it has: autojit was deprecated and later dropped, and a bare @jit now provides the same lazy, signature-free compilation.
