Automating CAPTCHA Entry with Selenium and the ddddocr Library

The ddddocr (带带弟弟OCR) Library

Recently (November 2023) I came across a Python CAPTCHA-recognition library called 带带弟弟OCR; its package name is ddddocr.

See the ddddocr project page on PyPI.

Combined with Selenium, this library lets a web automation script get past CAPTCHA checks.

Installation

Run the following command to install ddddocr:

pip3 install ddddocr

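At the time of writing, ddddocr depends on pillow without pinning a version, and it is incompatible with the latest release, pillow 10.

Run the following command to install pillow 9.5 explicitly: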

pip3 install pillow==9.5
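
The pin can also be checked at runtime before the OCR object is created. The following is a minimal sketch, not part of ddddocr: the helper names `pillow_is_compatible` and `installed_pillow_version` are hypothetical, and the check assumes the only incompatibility is the `Image.ANTIALIAS` constant that Pillow 10 removed.

```python
from importlib import metadata


def pillow_is_compatible(version: str) -> bool:
    """Return True if the given Pillow version is older than 10.

    Hypothetical helper: assumes the only problem is the
    Image.ANTIALIAS constant that Pillow 10 removed.
    """
    major = int(version.split(".")[0])
    return major < 10


def installed_pillow_version():
    """Return the installed Pillow version string, or None if it is absent."""
    try:
        return metadata.version("pillow")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_pillow_version()
    if version is None:
        print("pillow is not installed")
    elif pillow_is_compatible(version):
        print(f"pillow {version} is older than 10")
    else:
        print(f"pillow {version} is too new; run: pip3 install pillow==9.5")
```

Running the file directly prints which of the three cases applies to the current environment.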

Usage

Refer to the documentation on PyPI for how to use the library.

Here is a sample script:

# The installed pillow version must be 9.5, not 10; otherwise you get the error: module 'PIL.Image' has no attribute 'ANTIALIAS'

from selenium import webdriver
from selenium.webdriver.common.by import By
import time
import ddddocr

ocr = ddddocr.DdddOcr()

options = webdriver.ChromeOptions()
options.add_experimental_option('excludeSwitches', ['enable-logging']) # suppress Chrome's console log warnings and errors
driver = webdriver.Chrome(options=options)
driver.implicitly_wait(4)
driver.get('http://127.0.0.1/1.html')

time.sleep(2) # the captcha takes a while to appear, so wait 2 seconds

while True:
    # Capture the element as rendered on the page, as PNG image data
    pngData = driver.find_element(By.ID,'captcha').screenshot_as_png

    # with open('d:/tmp1.png', 'wb') as f:
    #     f.write(pngData)

    res = ocr.classification(pngData)
    print('CAPTCHA text:', res)
    
    # Press Enter to refresh and recognize a new captcha; type anything else to quit
    ch = input('')
    if ch != '':
        break
    
    driver.refresh()
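
The loop above relies on a person pressing Enter to retry. One way to automate that decision is sketched below, under stated assumptions rather than as the method used in this article. `solve_with_retries` and its callable parameters are hypothetical names: `grab_image` would wrap `screenshot_as_png`, `classify` would wrap `ocr.classification`, `accept` is whatever check you trust, and `refresh` would wrap `driver.refresh()`.

```python
from typing import Callable, Optional


def solve_with_retries(
    grab_image: Callable[[], bytes],
    classify: Callable[[bytes], str],
    accept: Callable[[str], bool],
    refresh: Callable[[], None],
    max_attempts: int = 5,
) -> Optional[str]:
    """Return the first OCR result that `accept` approves, or None."""
    for attempt in range(max_attempts):
        text = classify(grab_image())
        if accept(text):
            return text
        # Rejected: load a new CAPTCHA unless this was the last attempt.
        if attempt < max_attempts - 1:
            refresh()
    return None


if __name__ == "__main__":
    # Stand-ins for the browser and the OCR model, for demonstration only.
    fake_results = iter(["ab1", "x9Qk2m"])
    result = solve_with_retries(
        grab_image=lambda: b"",
        classify=lambda _png: next(fake_results),
        accept=lambda text: len(text) == 6,
        refresh=lambda: None,
    )
    print(result)
```

With the script above, the wiring would be roughly `solve_with_retries(lambda: driver.find_element(By.ID, 'captcha').screenshot_as_png, ocr.classification, accept, driver.refresh)`, where `accept` is a check you supply. Passing the browser and OCR calls in as callables keeps the retry logic testable without launching Chrome.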



The HTML of the web page being automated is shown below:

<html>
  <head>
    <meta charset="utf-8">
 
	<style>
	input[type=text] {
		padding: 12px 20px;
		display: inline-block;
		border: 1px solid #ccc;
		border-radius: 4px;
		box-sizing: border-box;
	}
	button{
	  background-color: #4CAF50;
		border: none;
		color: white;
		padding: 12px 30px;
		text-decoration: none;
		margin: 4px 2px;
		cursor: pointer;
	}
	canvas{
	  /*prevent interaction with the canvas*/
	  pointer-events:none;
	}
	</style>
</head>  

<body onload="createCaptcha()">
  <form onsubmit="validateCaptcha()">
    <div id="captcha">
    </div>
    <input type="text" placeholder="Captcha" id="cpatchaTextBox"/>
    <button type="submit">Submit</button>
  </form>
</body>

<script>
var code;
function createCaptcha() {
  //clear the contents of captcha div first 
  document.getElementById('captcha').innerHTML = "";
  var charsArray =
  "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ你我他们大小多少";
  var lengthOtp = 6;
  var captcha = [];
  for (var i = 0; i < lengthOtp; i++) {
    //the check below prevents repeated characters
    var index = Math.floor(Math.random() * charsArray.length); //pick a random index into charsArray
    if (captcha.indexOf(charsArray[index]) == -1)
      captcha.push(charsArray[index]);
    else i--;
  }
  var canv = document.createElement("canvas");
  canv.id = "captcha";
  canv.width = 100;
  canv.height = 50;
  var ctx = canv.getContext("2d");
  ctx.font = "25px Georgia";
  ctx.strokeText(captcha.join(""), 0, 30);
  //store the captcha text for validation; save it elsewhere if your requirements call for it
  code = captcha.join("");
  document.getElementById("captcha").appendChild(canv); // adds the canvas to the captcha div
}
function validateCaptcha() {
  event.preventDefault();
  if (document.getElementById("cpatchaTextBox").value == code) {
    alert("Valid Captcha")
  }else{
    alert("Invalid Captcha. try Again");
    createCaptcha();
  }
}
</script>
</html>
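
Because `createCaptcha()` is written to draw six distinct characters from `charsArray`, an OCR result can be sanity-checked before it is typed into the form. This is a minimal sketch that assumes the page above is used unchanged; `CHARSET`, `CAPTCHA_LENGTH`, and `looks_plausible` are illustrative names, not part of ddddocr or Selenium.

```python
import string

# Mirrors charsArray and lengthOtp from the page's createCaptcha().
CHARSET = set(
    string.digits
    + string.ascii_lowercase
    + string.ascii_uppercase
    + "你我他们大小多少"
)
CAPTCHA_LENGTH = 6


def looks_plausible(text: str) -> bool:
    """Check an OCR result against the rules the test page generates by."""
    return (
        len(text) == CAPTCHA_LENGTH
        and len(set(text)) == CAPTCHA_LENGTH  # the page never repeats a character
        and all(ch in CHARSET for ch in text)
    )
```

A result that fails this check is almost certainly a misread, so it is cheaper to call `driver.refresh()` and recognize a new image than to submit it. Passing the check does not prove the text is correct; it only rules out results the page could never have generated.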