{ "metadata": { "name": "", "signature": "sha256:b3ed9662f212cb00ac567f47f67a21b550e50871ef6e53e712dd50b2e83c6e6b" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Raw strings" ] }, { "cell_type": "code", "collapsed": false, "input": [ "len('\\\\')" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 1, "text": [ "1" ] } ], "prompt_number": 1 }, { "cell_type": "code", "collapsed": false, "input": [ "len(r'\\\\')" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 2, "text": [ "2" ] } ], "prompt_number": 2 }, { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "The re module" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import re" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "```re.fullmatch?```\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Type: function\n", "String form: <function fullmatch at 0x103e00730>\n", "File: /opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/re.py\n", "Definition: re.fullmatch(pattern, string, flags=0)\n", "Docstring:\n", "Try to apply the pattern to all of the string, returning\n", "a match object, or None if no match was found."
] }, { "cell_type": "code", "collapsed": false, "input": [ "re.fullmatch(r'a*bb*a((a|b)b*a)*', 'aaaba') is not None" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 4, "text": [ "True" ] } ], "prompt_number": 4 }, { "cell_type": "code", "collapsed": false, "input": [ "re.fullmatch(r'a*bb*a((a|b)b*a)*', 'aaabab') is not None" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 5, "text": [ "False" ] } ], "prompt_number": 5 }, { "cell_type": "code", "collapsed": false, "input": [ "re.fullmatch(r'a*bb*a((a|b)b*a)*', 'aaababa') is not None" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 6, "text": [ "True" ] } ], "prompt_number": 6 }, { "cell_type": "markdown", "metadata": {}, "source": [ "```re.findall?```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Type: function\n", "String form: <function findall at 0x103e009d8>\n", "File: /opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/re.py\n", "Definition: re.findall(pattern, string, flags=0)\n", "Docstring:\n", "Return a list of all non-overlapping matches in the string.\n", "\n", "If one or more capturing groups are present in the pattern, return\n", "a list of groups; this will be a list of tuples if the pattern\n", "has more than one group.\n", "\n", "Empty matches are included in the result." ] }, { "cell_type": "code", "collapsed": false, "input": [ "re.findall(r'<.*>', '<h1>Title</h1><h2>Subtitle</h2>')" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 7, "text": [ "['<h1>Title</h1><h2>Subtitle</h2>']" ] } ], "prompt_number": 7 }, { "cell_type": "code", "collapsed": false, "input": [ "re.findall(r'<.*?>', '<h1>Title</h1><h2>Subtitle</h2>')" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 8, "text": [ "['<h1>', '</h1>', '<h2>', '</h2>
']" ] } ], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "```re.search?```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Type: function\n", "String form: <function search at 0x103e007b8>\n", "File: /opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/re.py\n", "Definition: re.search(pattern, string, flags=0)\n", "Docstring:\n", "Scan through string looking for a match to the pattern, returning\n", "a match object, or None if no match was found." ] }, { "cell_type": "code", "collapsed": false, "input": [ "re.search(r'([_0-9a-zA-Z.]+)?@([_0-9a-zA-Z.]+)', 'My email is d.v.kornev@gmail.com')" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 9, "text": [ "<_sre.SRE_Match object; span=(12, 32), match='d.v.kornev@gmail.com'>" ] } ], "prompt_number": 9 }, { "cell_type": "code", "collapsed": false, "input": [ "mo = re.search(r'([_0-9a-zA-Z.]+)?@([_0-9a-zA-Z.]+)', 'My email is d.v.kornev@gmail.com')" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 10 }, { "cell_type": "code", "collapsed": false, "input": [ "mo.group()" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 11, "text": [ "'d.v.kornev@gmail.com'" ] } ], "prompt_number": 11 }, { "cell_type": "code", "collapsed": false, "input": [ "mo.group(0)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 12, "text": [ "'d.v.kornev@gmail.com'" ] } ], "prompt_number": 12 }, { "cell_type": "code", "collapsed": false, "input": [ "mo.group(1)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 13, "text": [ "'d.v.kornev'" ] } ], "prompt_number": 13 }, { "cell_type": "code", "collapsed": false, "input": [ "mo.group(2)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {},
"output_type": "pyout", "prompt_number": 14, "text": [ "'gmail.com'" ] } ], "prompt_number": 14 }, { "cell_type": "code", "collapsed": false, "input": [ "re.fullmatch(r'(.*)cad\\1', 'abracadabra') is not None" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 15, "text": [ "True" ] } ], "prompt_number": 15 }, { "cell_type": "code", "collapsed": false, "input": [ "re.search(r'.*(?= Newton)', 'Isaac Newton')" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 16, "text": [ "<_sre.SRE_Match object; span=(0, 5), match='Isaac'>" ] } ], "prompt_number": 16 }, { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Performance considerations" ] }, { "cell_type": "code", "collapsed": false, "input": [ "very_big_data = ['some good string'] * 10000 + ['some bad string'] * 10000" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 17 }, { "cell_type": "code", "collapsed": false, "input": [ "len(very_big_data)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 18, "text": [ "20000" ] } ], "prompt_number": 18 }, { "cell_type": "code", "collapsed": false, "input": [ "def count_good_strings(data):\n", "    count = 0\n", "    for string in data:\n", "        if re.search(r'good', string) is not None:\n", "            count += 1\n", "    return count" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 19 }, { "cell_type": "code", "collapsed": false, "input": [ "count_good_strings(very_big_data)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 20, "text": [ "10000" ] } ], "prompt_number": 20 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit count_good_strings(very_big_data)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "10 loops, best of 3: 27.3 ms per loop\n" ] } ], "prompt_number": 21 }, { "cell_type": "code", "collapsed": false, "input": [ "def count_good_strings_fast(data):\n", "    # Compile the pattern once, outside the loop.\n", "    regex = re.compile(r'good')\n", "    count = 0\n", "    for string in data:\n", "        if regex.search(string):\n", "            count += 1\n", "    return count" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 22 }, { "cell_type": "code", "collapsed": false, "input": [ "count_good_strings_fast(very_big_data)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 23, "text": [ "10000" ] } ], "prompt_number": 23 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit count_good_strings_fast(very_big_data)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "100 loops, best of 3: 7.55 ms per loop\n" ] } ], "prompt_number": 24 }, { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Extracting and modifying data" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import re\n", "import requests\n", "\n", "r = requests.get('http://python.org/')\n", "r.text" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 25, "text": [ "'\\n\\n\\n\\n \\n\\n\\n \\n \\n\\n \\n\\n \\n \\n \\n \\n \\n\\n \\n \\n \\n \\n \\n\\n \\n\\n \\n \\n \\n\\n \\n\\n \\n \\n \\n \\n \\n \\n \\n\\n \\n \\n \\n \\n\\n \\n \\n\\n Welcome to Python.org\\n \\n\\n \\n \\n \\n \\n \\n \\n \\n \\n \\n \\n \\n \\n\\n \\n\\n \\n\\n \\n \\n\\n \\n \\n \\n\\n\\n\\n\\n
[... remainder of the python.org page text omitted ...]
\\n\\n \\n \\n \\n\\n \\n \\n\\n \\n\\n \\n\\n \\n\\n \\n \\n\\n\\n\\n'" ] } ], "prompt_number": 25 }, { "cell_type": "code", "collapsed": false, "input": [ "urls = re.findall(r'